Network information methods devices and systems

ABSTRACT

Methods and systems for predicting links in a network, such as a social network, are disclosed. The existing network structure can be used to optimize link prediction. The methods and systems can learn a distance metric and/or a degree preference function that are structure preserving to predict links for new/existing nodes based on node properties.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of application Ser. No. 13/707,478 filed Dec. 6, 2012, the content of which is hereby incorporated by reference in its entirety, where application Ser. No. 13/708,478 claims the benefit of U.S. Provisional Application No. 61/567,518 filed Dec. 6, 2011, the content of which is hereby incorporated by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

The present invention was made with government support under grant numbers N66001-09-C-0080 awarded by the Department of Homeland Security (DHS) and IIS-1117631 by the National Science Foundation (NSF). The U.S. government has certain rights in the invention.

BACKGROUND

Many real-world networks are described by both connectivity information and features for every node. Many social networks are of this form; on services such as Facebook, Twitter, or LinkedIn, there are profiles which describe each person. In addition, participants communicate and transact with each other. Other examples, such as etsy.com, in which buyers find small vendors in a large framework, are similar to social media. Sites such as reddit.com allow users to find links to media or comments, providing a framework that could be improved by allowing users to find the material they find most interesting. The proliferation of social networks on the web has spurred interest in the development of recommender systems to increase the value derived by participants. There exist challenges in making recommendations based on user information and their activities because people form relationships for a variety of reasons. For example, in Facebook perhaps they share similar parts of their profile such as their school or major, or perhaps they have completely different profiles. There is an on-going need for improvements in this area. In addition, there are many systems, such as reddit and etsy, which provide a decent framework that is susceptible to improvement by providing a good recommendation system.

SUMMARY

Embodiments of the disclosed subject matter include systems, devices, and methods that employ existing network data, including node features and structural characteristics (links) of a network, to predict new links that are desired, expected, most preferred, or recommended, or to predict the likelihood of new links. For example, in a network of friendships, each friendship may define a link in a population, and the characteristics of the individuals, such as height, preferred sport, gender, age, etc., would form a feature set. These feature sets and links may be used to train a machine learning engine that can then predict, for an individual characterized by a new feature set, one or more friendships (“links”) that would be desired by that individual, expected to arise, most preferred by the individuals were they to befriend each other, or recommended by the individuals, or the likelihood of such friendships. Essentially, the link information is used as a latent measure of the value of the pairings embodied by the links. In the presently disclosed subject matter, the value of the pairings may incorporate latent factors that involve pairings that are not just local to the individual pair and the features at each end of the link defined by the pair. That is, there may be latent values expressed in the extended network, the neighborhood, or the entire network that should affect the prediction of a new link as they affect the existence of the link in the network used to train the machine learning prediction engine. Thus, there is information in the network outside the pair that should affect a prediction engine's estimate of the desirability, expectation, preference, recommendation, likelihood, or value of a friendship forming between a given pair of individuals. Networks to which such a prediction may be applied are varied but could include networks of products linked with purchasers, social media sites, dating sites, Twitter, Facebook, LinkedIn, an orientation service for transferees or new students for a school, etc.

In the examples and other networks, the disclosed subject matter provides a prediction engine that applies a distance metric that is learned from one or more example networks with established links and nodes characterized by feature vectors. Systems and methods for estimating distance metrics for a network, which network is characterized by connectivity information and features for each node, are described. The systems and methods permit link prediction using both the node features and existing network connections. The method employs a structure-preserving predictor, by which it is meant that given an input network having unique nodes, a set of distance metrics between the nodes may be generated which completely preserves the structural (link) information in the network. Thus, the distance metric data can be used to reconstruct the network substantially or, depending on resource cost considerations or other factors, perfectly. The extraction of such data from an existing network is called structure preserving metric learning or SPML. The extraction of predicted links from an SPML from an existing network, which includes limiting to an actual degree of connectivity (i.e., connectivity of the training network is also preserved or recovered from the node feature data), is identified here as degree distribution metric learning or DDML. In DDML, in addition to learning a structure preserving distance metric, a degree prediction function is also learned that can predict the number of links a node is likely to have, or should have, based on node features. In a friendship network, for example, the recommender is enabled not only to measure the goodness of various possible new friendships, but also to estimate, for a given person, how many friendships should ultimately attach to that person.

In embodiments, methods for SPML and SPML/DDML combine linear constraints that require graph structure to be preserved with a Frobenius norm regularizer on a distance metric and a regularization parameter to create a semidefinite program (SDP) that learns the distance metric, which is structure preserving. Preserving graph topology may be done by enforcing linear constraints on distances between nodes. The linear structure preserving constraints for metric learning used by SPML/DDML enforce that neighbors of each node are closer than most others. Given an input network having unique nodes, SPML/DDML learns a distance metric between nodes that preserves the structural information in the network.

Methods disclosed herein can improve the efficiency of SPML/DDML by optimizing the method based on stochastic gradient descent (SGD), which removes the running-time dependency on the size of the network and allows the method to easily scale to networks of thousands of nodes and millions of edges. In addition, the methods disclosed herein may be suitable for parallelization and cloud-computing implementation.

The disclosed subject matter can be used in systems for providing improved prediction of new connections to users of social networking services, including internet based services (e.g. Facebook, LinkedIn, and Twitter). The disclosed subject matter can be used in systems for providing improved link prediction for documents included in an online document collection, such as a wiki online service (e.g. Wikipedia). The disclosed subject matter can also improve related product predictions provided by online retailers to users that have viewed a product's webpage.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments will hereinafter be described in detail below with reference to the accompanying drawings, wherein like reference numerals represent like elements. The accompanying drawings have not necessarily been drawn to scale. Where applicable, some features may not be illustrated to assist in the description of underlying features.

FIG. 1A is a block diagram of an exemplary embodiment of a structure preserving metric learning (SPML/DDML) link prediction system according to some embodiments of the disclosed subject matter.

FIG. 1B is a flowchart showing an exemplary embodiment of a structure preserving metric learning (SPML/DDML) link prediction method according to some embodiments of the disclosed subject matter.

FIG. 2 is a flowchart showing an exemplary embodiment of a structure preserving metric learning (SPML/DDML) connection prediction method according to some embodiments of the disclosed subject matter.

FIG. 3 is a diagram of an exemplary embodiment of a structure preserving metric learning (SPML/DDML) prediction system according to some embodiments of the disclosed subject matter.

FIG. 4 is a flowchart showing an exemplary method of SPML/DDML link prediction according to some embodiments of the disclosed subject matter.

FIG. 5 is a flowchart showing an exemplary method of SPML/DDML link prediction according to some embodiments of the disclosed subject matter.

FIG. 6 is a flowchart showing an exemplary method of DDML link degree prediction according to some embodiments of the disclosed subject matter.

FIG. 7 is a flowchart showing an exemplary method of SPML/DDML link prediction using network partitioning according to some embodiments of the disclosed subject matter.

FIG. 8 is a block diagram of an exemplary embodiment of a distributed structure preserving metric learning (SPML/DDML) link prediction system according to some embodiments of the disclosed subject matter.

FIG. 9 is a block diagram of a system for predicting friendships to new users of a social network using SPML/DDML according to some embodiments of the disclosed subject matter.

FIG. 10 is a block diagram of a system for predicting friendships between users of a social network using SPML/DDML according to some embodiments of the disclosed subject matter.

FIG. 11 is a block diagram of a system for predicting links to new documents added to an information network using SPML/DDML according to some embodiments of the disclosed subject matter.

FIG. 12 is a block diagram of a system for predicting links between documents in an information network using SPML/DDML according to some embodiments of the disclosed subject matter.

FIG. 13 is a block diagram of a system for predicting connections to new members joining a dating service using SPML/DDML according to some embodiments of the disclosed subject matter.

FIG. 14 is a block diagram of a system for predicting connections between members in a dating service using SPML/DDML according to some embodiments of the disclosed subject matter.

FIG. 15 is a block diagram of a system for recommending products to new users of a shopping service using SPML/DDML according to some embodiments of the disclosed subject matter.

FIG. 16 is a block diagram of a system for recommending products to users of a shopping service using SPML/DDML according to some embodiments of the disclosed subject matter.

FIG. 17 is a block diagram of an exemplary embodiment of a structure preserving distance-metric learning link prediction system according to some embodiments of the disclosed subject matter.

FIG. 18 illustrates synthetic SPML experiment results.

FIG. 19 illustrates Wikipedia and Facebook experiment results.

FIG. 20 provides a comparison of Facebook social networks from four schools in terms of feature importance computed from the learned structure preserving metric.

FIG. 21 illustrates a ROC curve for various algorithms on the “philosophy concepts” category.

FIG. 22 illustrates the performance of low-rank SPML on training data varying the rank parameter, run on a single Facebook school.

DETAILED DESCRIPTION OF THE DRAWINGS AND EMBODIMENTS

Embodiments of the disclosed subject matter relate generally to methods and systems for distance-metric learning using a network described by both connectivity information and features for each node, and for link prediction using node features and the learned distance metric. In embodiments, a degree prediction function is also learned to predict, based on node features, the number of links a node is likely to have.

The proliferation of social networks on the web has spurred many significant advances in modeling networks. However, while many efforts have been focused on modeling networks as weighted or unweighted graphs, or constructing features from links to describe the nodes in a network, few techniques have focused on real-world network data which consists of node features in addition to connectivity information. Many social networks are of this form; on services such as Facebook, Twitter, or LinkedIn, there are profiles which describe each person, as well as the connections they make. The relationship between a node's features and connections is often not explicit. For example, people “friend” each other on Facebook for a variety of reasons: perhaps they share similar parts of their profile such as their school or major, or perhaps they have completely different profiles. Various embodiments of the disclosed subject matter can learn the relationship between profiles and links from massive social networks such that these embodiments can better predict who is likely to connect. To model this relationship, one could simply model each link independently, where one simply learns what characteristics of two profiles imply a possible link. However, this approach ignores the structural characteristics of the links in the network. Modeling independent links is likely insufficient, and in order to better model these networks one should account for the inherent topology of the network as well as the interactions between the features of nodes. Various embodiments of the disclosed subject matter therefore perform structure preserving metric learning (SPML) and/or degree distribution metric learning (DDML), methods for learning a distance metric between nodes that preserves the structural network of data used to learn the metric.

Some known metric learning algorithms, applied to supervised learning tasks such as classification, first build a k-nearest neighbors (kNN) graph from training data with a fixed k, and then optimize a metric to generate a class label for a new point by a majority vote of nearby points. The metric is optimized based on the goal of keeping connected points with similar labels (same or similar class) close while pushing away those of different class (class impostors). Points which are connected but which belong to different classes may be pushed away. Fundamentally, these supervised methods aim to learn a distance metric such that applying a connectivity algorithm (for instance, k-nearest neighbors) under the metric will produce a graph where no point is connected to others with different class labels. In practice, these constraints are enforced with slack. Once the metric is learned, the class label for a new data point can be predicted by the majority vote of nearby points under the learned metric.

Unfortunately, some of these metric learning algorithms are not easily applied when a network is given as input instead of class labels for each point. Under such a regime, SPML and DDML learn a metric such that points connected in the network are close and points which are unconnected are more distant. Intuitively, certain features or groups of features should influence how nodes connect, and thus it should be possible to learn a mapping from features to connectivity such that the mapping respects the underlying topological structure of the network. Like some previous metric learning methods, SPML and DDML learn a metric which reconciles the input features with some auxiliary information such as class labels. In this case, instead of pushing away class impostors, SPML and DDML push away graph impostors (points which are close in terms of distance but which should remain unconnected), ultimately preserving the topology of the network. Thus SPML and DDML learn a metric where the learned distances are inherently tied to the original input connectivity.

Preserving graph topology is possible by enforcing simple linear constraints on distances between nodes. By adapting the constraints from the graph embedding technique structure preserving embedding, various embodiments of the disclosed subject matter formulate simple linear structure preserving constraints for metric learning that enforce that neighbors of each node are closer than all others. Furthermore, various embodiments of the disclosed subject matter adapt these constraints for an online setting similar to PEGASOS and OASIS, such that SPML and/or DDML can be applied to large networks by optimizing with stochastic gradient descent (SGD).

Structure Preserving Metric Learning (SPML)

Given as input an adjacency matrix $A \in \{0,1\}^{n \times n}$ and node features $X \in \mathbb{R}^{d \times n}$, structure preserving metric learning (SPML) learns a Mahalanobis distance metric parameterized by a positive semidefinite (PSD) matrix $M \in \mathbb{R}^{d \times d}$, where $M \succeq 0$. The distance between two points under the metric is defined as

$D_M(x_i, x_j) = (x_i - x_j)^T M (x_i - x_j)$  (1)

When the metric is given by the identity $M = I_d$, $D_M(x_i, x_j)$ represents the squared Euclidean distance between the ith and jth points. Learning $M$ is equivalent to learning a linear scaling on the input features $LX$, where $M = L^T L$ and $L \in \mathbb{R}^{d \times d}$. SPML learns an $M$ which is structure preserving, as defined in Definition 1. Given a connectivity algorithm $\mathcal{G}$, SPML learns a metric such that applying $\mathcal{G}$ to the input data using the learned metric produces the input adjacency matrix exactly ($\mathcal{G}$ is interchangeably used herein to denote the set of feasible graphs and the algorithm used to find the optimal connectivity within the set of feasible graphs). Possible choices for $\mathcal{G}$ include, for example, maximum weight b-matching, k-nearest neighbors, ε-neighborhoods, or maximum weight spanning tree.

Definition 1: Given a graph with adjacency matrix $A$, a distance metric parameterized by $M \in \mathbb{R}^{d \times d}$ is structure preserving with respect to a connectivity algorithm $\mathcal{G}$ if $\mathcal{G}(X, M) = A$.
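By way of non-limiting illustration only, Equation (1) and Definition 1 can be sketched in Python for the case where the connectivity algorithm is k-nearest neighbors with k set to each node's degree. The function names (`mahalanobis_sq`, `knn_connectivity`, `is_structure_preserving`), the toy data, and the use of NumPy are assumptions made for this sketch and do not appear in the disclosure.

```python
import numpy as np

def mahalanobis_sq(M, xi, xj):
    """Equation (1): (xi - xj)^T M (xi - xj)."""
    diff = xi - xj
    return float(diff @ M @ diff)

def knn_connectivity(X, M, degrees):
    """Connect node i to its degrees[i] nearest nodes under the metric M.

    X is d x n (one column per node). Returns a directed n x n 0/1 matrix.
    """
    n = X.shape[1]
    A_hat = np.zeros((n, n), dtype=int)
    for i in range(n):
        dist = np.array([mahalanobis_sq(M, X[:, i], X[:, j]) for j in range(n)])
        dist[i] = np.inf  # a node is never its own neighbor
        nearest = np.argsort(dist, kind="stable")[: degrees[i]]
        A_hat[i, nearest] = 1
    return A_hat

def is_structure_preserving(A, X, M):
    """Definition 1 with k-nearest neighbors and k equal to each node's degree."""
    return np.array_equal(knn_connectivity(X, M, A.sum(axis=1)), A)

# Toy data: two well-separated pairs of points on a line (d = 1, n = 4).
X = np.array([[0.0, 1.0, 10.0, 11.0]])
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])
print(is_structure_preserving(A, X, np.eye(1)))
```

In this toy case the identity metric already reproduces the input graph; SPML is concerned with the general case in which the metric must be learned so that the check succeeds.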

Preserving Graph Topology with Linear Constraints

To preserve graph topology, the same linear constraints as structure preserving embedding (SPE) are used, but they are applied to $M$, which parameterizes the distances between points. A useful tool for defining distances as linear constraints on $M$ is the transformation

$D_M(x_i, x_j) = x_i^T M x_i + x_j^T M x_j - x_i^T M x_j - x_j^T M x_i$  (2)

which allows linear constraints on the distances to be written as linear constraints on the $M$ matrix. For different connectivity schemes below, linear constraints are presented which enforce graph structure preservation.
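As a minimal sketch, assuming NumPy and a feature matrix with one column per node, the transformation of Equation (2) allows all pairwise distances to be read off the matrix $X^T M X$ at once. The name `pairwise_distances` and the random test data are hypothetical.

```python
import numpy as np

def pairwise_distances(X, M):
    """All D_M(x_i, x_j) at once using Equation (2).

    With G = X^T M X, D[i, j] = G[i, i] + G[j, j] - G[i, j] - G[j, i].
    """
    G = X.T @ M @ X
    g = np.diag(G)
    return g[:, None] + g[None, :] - G - G.T

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))   # d = 3 features, n = 5 nodes
L = rng.normal(size=(3, 3))
M = L.T @ L                   # a PSD metric
D = pairwise_distances(X, M)

# Spot check against the direct form of Equation (1).
diff = X[:, 0] - X[:, 1]
print(np.isclose(D[0, 1], diff @ M @ diff))
```

Because each entry of `D` is a linear function of the entries of `M`, any linear inequality between distances is a linear inequality in `M`.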

Nearest Neighbor Graphs

The k-nearest neighbor algorithm (k-nn) connects each node to the k neighbors to which the node has the shortest distance, where k is an input parameter; therefore, setting k to the true degree for each node, the distances to all disconnected nodes must be larger than the distance to the farthest connected neighbor:

$D_M(x_i, x_j) > (1 - A_{ij}) \max_l (A_{il} D_M(x_i, x_l)), \forall i, j$  (3)

Similarly, preserving an ε-neighborhood graph obeys linear constraints on $M$:

$D_M(x_i, x_j) \le \epsilon, \forall \{i, j \mid A_{ij} = 1\}$, and

$D_M(x_i, x_j) \ge \epsilon, \forall \{i, j \mid A_{ij} = 0\}$  (4)

If, for each node, the connected distances are less than the unconnected distances (or some ε), i.e., the metric obeys the above linear constraints, Definition 1 is satisfied, and thus the connectivity computed under the learned metric $M$ is exactly $A$.
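The constraints of Equations (3) and (4) can be checked directly on a matrix of pairwise distances. The following sketch counts violated constraints; it assumes that the diagonal (i = j) is excluded from the comparison, which the text does not state explicitly, and the function names and toy data are illustrative only.

```python
import numpy as np

def knn_violations(A, D):
    """Count disconnected (i, j) pairs that violate Equation (3)."""
    farthest_neighbor = (A * D).max(axis=1)   # max_l A_il * D(x_i, x_l)
    disconnected = (A == 0)
    np.fill_diagonal(disconnected, False)     # assumption: skip i == j
    return int(np.sum(disconnected & (D <= farthest_neighbor[:, None])))

def eps_violations(A, D, eps):
    """Count ordered (i, j) pairs that violate Equation (4)."""
    off_diag = ~np.eye(len(A), dtype=bool)    # assumption: skip i == j
    too_far = (A == 1) & (D > eps)
    too_close = (A == 0) & (D < eps) & off_diag
    return int(np.sum(too_far | too_close))

# Toy data: points 0, 1, 10, 11 on a line, linked in two pairs.
x = np.array([0.0, 1.0, 10.0, 11.0])
D = (x[:, None] - x[None, :]) ** 2            # squared Euclidean distances
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])
print(knn_violations(A, D), eps_violations(A, D, eps=4.0))
```

A count of zero for the chosen connectivity scheme corresponds to the structure preserving condition of Definition 1 on this example.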

Maximum Weight Subgraphs

Unlike nearest neighbor algorithms, which select edges greedily for each node, maximum weight subgraph algorithms select edges from a weighted graph to produce a subgraph which has total maximal weight. Given a metric parameterized by $M$, let the weight between two points $(i, j)$ be the negated pairwise distance between them:

$Z_{ij} = -D_M(x_i, x_j) = -(x_i - x_j)^T M (x_i - x_j)$  (6)

For example, maximum weight b-matching finds the maximum weight subgraph while also enforcing that every node has a fixed degree $b_i$ for each ith node. The formulation for maximum weight spanning tree is similar. Unfortunately, preserving structure for these algorithms requires enforcing many linear constraints of the form:

$\mathrm{tr}(Z^T A) \ge \mathrm{tr}(Z^T \tilde{A}), \forall \tilde{A} \in \mathcal{G}$  (7)

This reveals one critical difference between structure preserving constraints of these algorithms and those of nearest-neighbor graphs: there are exponentially many linear constraints. To avoid an exponential enumeration, the most violated inequalities can be introduced sequentially using a cutting-plane approach as shown in the next section.
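For illustration, the weight matrix of Equation (6) and a single constraint of the form of Equation (7) might be evaluated as in the sketch below. The alternative graph `A_alt` is chosen by hand; finding the worst violator over all feasible graphs would require a b-matching or spanning tree solver, which is outside this sketch. The helper names are hypothetical and $M$ is assumed symmetric.

```python
import numpy as np

def weight_matrix(X, M):
    """Z_ij = -D_M(x_i, x_j) per Equation (6), for symmetric M."""
    G = X.T @ M @ X
    g = np.diag(G)
    return 2.0 * G - g[:, None] - g[None, :]

def constraint_gap(Z, A, A_alt):
    """tr(Z^T A) - tr(Z^T A_alt); a non-negative gap satisfies Equation (7)."""
    return float(np.sum(Z * A) - np.sum(Z * A_alt))

X = np.array([[0.0, 1.0, 10.0, 11.0]])   # d = 1, n = 4
A = np.array([[0, 1, 0, 0],              # true graph: pairs (0,1) and (2,3)
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])
A_alt = np.array([[0, 0, 1, 0],          # competing 1-matching: (0,2) and (1,3)
                  [0, 0, 0, 1],
                  [1, 0, 0, 0],
                  [0, 1, 0, 0]])
Z = weight_matrix(X, np.eye(1))
print(constraint_gap(Z, A, A_alt))
```

Here the true graph has the larger total weight because its edges join nearby points, so this particular constraint is satisfied under the identity metric.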

Algorithm Derivation

By combining the linear constraints from the previous section with a Frobenius norm (denoted $\|\cdot\|_F$) regularizer on $M$ and regularization parameter $\lambda$, we have a simple semidefinite program (SDP) which learns an $M$ that is structure preserving and has minimal complexity. Algorithm 1 summarizes the naive implementation of SPML when the connectivity algorithm is k-nearest neighbors, which is optimized by a standard SDP solver. For maximum weight subgraph connectivity (e.g., b-matching), a cutting-plane method can be used, iteratively finding the worst violating constraint and adding it to a working-set. The most violated constraint at each iteration can be found by computing the adjacency matrix $\tilde{A}$ that maximizes $\mathrm{tr}(\tilde{Z}^T \tilde{A})$ s.t. $\tilde{A} \in \mathcal{G}$, which can be done using various published methods.

See, for example, C. Fremuth-Paeger and D. Jungnickel, Balanced network flows, a unifying framework for design and analysis of matching algorithms, Networks, 33(1):1-28, 1999; B. Huang and T. Jebara, Loopy belief propagation for bipartite maximum weight b-matching, Proc. 11th Intl. Conf. on Artificial Intelligence and Statistics; and/or B. Huang and T. Jebara, Fast b-matching via sufficient selection belief propagation, Proc. of the 14th Intl. Conf. on Artificial Intelligence and Statistics, 2011.

Each added constraint enforces that the total weight along the edges of the true graph is greater than the total weight of any other graph by some margin. Algorithm 2 shows the steps for SPML with cutting-plane constraints.

Algorithm 1: Structure Preserving Metric Learning with Nearest Neighbor Constraints

Input: $A \in \{0,1\}^{n \times n}$, $X \in \mathbb{R}^{d \times n}$, and parameter $\lambda$
1: $\mathcal{K} = \{M \succeq 0,\ D_M(x_i, x_j) \ge (1 - A_{ij}) \max_l (A_{il} D_M(x_i, x_l)) + 1 - \xi\ \ \forall i, j\}$
2: $\tilde{M} \leftarrow \arg\min_{M \in \mathcal{K}} \frac{\lambda}{2} \|M\|_F^2 + \xi$ {Found via SDP}
3: return $\tilde{M}$

Algorithm 2: Structure Preserving Metric Learning with Cutting-Plane Constraints

Input: $A \in \{0,1\}^{n \times n}$, $X \in \mathbb{R}^{d \times n}$, connectivity algorithm $\mathcal{G}$, and parameters $\lambda$, $k$
1: $\mathcal{K} = \{M \succeq 0\}$
2: repeat
3:   $\tilde{M} \leftarrow \arg\min_{M \in \mathcal{K}} \frac{\lambda}{2} \|M\|_F^2 + \xi$ {Found via SDP}
4:   $\tilde{Z} \leftarrow 2 X^T \tilde{M} X - \mathrm{diag}(X^T \tilde{M} X) 1^T - 1\, \mathrm{diag}(X^T \tilde{M} X)^T$
5:   $\tilde{A} \leftarrow \arg\max_{\tilde{A}} \mathrm{tr}(\tilde{Z}^T \tilde{A})$ s.t. $\tilde{A} \in \mathcal{G}$ {Find worst violator}
6:   if $|\mathrm{tr}(\tilde{Z}^T \tilde{A}) - \mathrm{tr}(\tilde{Z}^T A)| \ge k$ then
7:     add constraint to $\mathcal{K}$: $\mathrm{tr}(Z^T A) - \mathrm{tr}(Z^T \tilde{A}) > 1 - \xi$
8:   end if
9: until $|\mathrm{tr}(\tilde{Z}^T \tilde{A}) - \mathrm{tr}(\tilde{Z}^T A)| \le k$
10: return $\tilde{M}$

For networks larger than a few hundred nodes or for high-dimensional features, these SDPs may not scale well. The complexity of the SDP may scale with the number of variables and constraints, yielding a worst-case time of $O(d^3 + C^3)$ where $C = O(n^2)$. By temporarily omitting the PSD requirement on $M$, Algorithm 2 becomes equivalent to a one-class structural support vector machine (structural SVM). Stochastic SVM algorithms have been recently developed that have convergence time with no dependence on input size. Therefore, a large-scale algorithm based on projected stochastic subgradient descent is developed. The proposed adaptation removes the dependence on n, where each iteration of the algorithm is $O(d^2)$, sampling one random constraint at a time. The optimization can be rewritten as unconstrained over an objective function with a hinge-loss on the structure preserving constraints:

$f(M) = \frac{\lambda}{2} \|M\|_F^2 + \frac{1}{|S|} \sum_{(i,j,k) \in S} \max(D_M(x_i, x_j) - D_M(x_i, x_k) + 1, 0)$  (8)

Here the constraints have been written in terms of hinge-losses over triplets, each consisting of a node, its neighbor and its non-neighbor. The set of all such triplets is $S = \{(i, j, k) \mid A_{ij} = 1, A_{ik} = 0\}$. Using the distance transformation in Equation 2, each of the $|S|$ constraints can be written using a sparse matrix $C^{(i,j,k)}$, where

$C_{jj}^{(i,j,k)} = 1$, $C_{ik}^{(i,j,k)} = 1$, $C_{ki}^{(i,j,k)} = 1$, $C_{ij}^{(i,j,k)} = -1$, $C_{ji}^{(i,j,k)} = -1$, $C_{kk}^{(i,j,k)} = -1$,

and whose other entries are zero. By construction, sparse matrix multiplication of $C^{(i,j,k)}$ indexes the proper elements related to nodes i, j, and k, such that $\mathrm{tr}(C^{(i,j,k)} X^T M X)$ is equal to $D_M(x_i, x_j) - D_M(x_i, x_k)$. The subgradient of $f$ at $M$ is then

$\nabla f = \lambda M + \frac{1}{|S|} \sum_{(i,j,k) \in S_+} X C^{(i,j,k)} X^T$,  (9)

where

$S_+ = \{(i, j, k) \mid D_M(x_i, x_j) - D_M(x_i, x_k) + 1 > 0\}$  (10)

If for all triplets this quantity is negative, there exists no unconnected neighbor of a point which is closer than a point's farthest connected neighbor, which is precisely the structure preserving criterion for nearest neighbor algorithms. In some embodiments this objective function is optimized via stochastic subgradient descent. These embodiments sample a batch of triplets, replacing $S$ in the objective function with a random subset of $S$ of size $B$. If a true metric is necessary, various embodiments intermittently project $M$ onto the PSD cone. Full details about constructing the constraint matrices and minimizing the objective are shown in Algorithm 3.
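The role of the constraint matrix $C^{(i,j,k)}$ and of the hinge-loss objective can be illustrated with a short NumPy sketch. The matrix is stored densely here for readability, the objective enumerates every triplet (practical only for very small graphs), and triplets with k = i are skipped, which is an assumption of this sketch rather than something stated in the text. All function names are hypothetical.

```python
import numpy as np

def triplet_matrix(n, i, j, k):
    """Constraint matrix C^(i,j,k), stored densely here for clarity."""
    C = np.zeros((n, n))
    C[j, j] += 1.0
    C[i, k] += 1.0
    C[k, i] += 1.0
    C[i, j] -= 1.0
    C[j, i] -= 1.0
    C[k, k] -= 1.0
    return C

def hinge_objective(M, X, A, lam):
    """Regularized hinge loss averaged over all triplets of the graph."""
    n = A.shape[0]
    G = X.T @ M @ X
    g = np.diag(G)
    D = g[:, None] + g[None, :] - G - G.T
    losses = [max(D[i, j] - D[i, k] + 1.0, 0.0)
              for i in range(n) for j in range(n) for k in range(n)
              if A[i, j] == 1 and A[i, k] == 0 and k != i]  # k != i: assumption
    return 0.5 * lam * np.sum(M * M) + float(np.mean(losses))

# Check that tr(C X^T M X) equals D_M(x_i, x_j) - D_M(x_i, x_k).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 6))
L = rng.normal(size=(3, 3))
M = L.T @ L
i, j, k = 0, 2, 5
C = triplet_matrix(6, i, j, k)
lhs = np.trace(C @ X.T @ M @ X)
dij, dik = X[:, i] - X[:, j], X[:, i] - X[:, k]
rhs = dij @ M @ dij - dik @ M @ dik
print(np.isclose(lhs, rhs))
```

Since $C^{(i,j,k)}$ is symmetric, the derivative of $\mathrm{tr}(C^{(i,j,k)} X^T M X)$ with respect to $M$ is $X C^{(i,j,k)} X^T$, which is the term that appears in the subgradient.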

Algorithm 3: Structure Preserving Metric Learning with Nearest Neighbor Constraints and Optimization with Projected Stochastic Subgradient Descent

Input: $A \in \{0,1\}^{n \times n}$, $X \in \mathbb{R}^{d \times n}$, and parameters $\lambda$, $T$, $B$
1: $M_1 \leftarrow I_d$
2: for $t$ from 1 to $T - 1$ do
3:   $\eta_t \leftarrow \frac{1}{\lambda t}$
4:   $C \leftarrow 0_{n,n}$
5:   for $b$ from 1 to $B$ do
6:     $(i, j, k) \leftarrow$ Sample random triplet from $S = \{(i, j, k) \mid A_{ij} = 1, A_{ik} = 0\}$
7:     if $D_{M_t}(x_i, x_j) - D_{M_t}(x_i, x_k) + 1 > 0$ then
8:       $C_{jj} \leftarrow C_{jj} + 1$, $C_{ik} \leftarrow C_{ik} + 1$, $C_{ki} \leftarrow C_{ki} + 1$
9:       $C_{ij} \leftarrow C_{ij} - 1$, $C_{ji} \leftarrow C_{ji} - 1$, $C_{kk} \leftarrow C_{kk} - 1$
10:     end if
11:   end for
12:   $\nabla_t \leftarrow X C X^T + \lambda M_t$
13:   $M_{t+1} \leftarrow M_t - \eta_t \nabla_t$
14:   Optional: $M_{t+1} \leftarrow [M_{t+1}]^+$ {Project onto the PSD cone}
15: end for
16: return $M_T$

Analysis

In this section, analysis for the scaling behavior of SPML using SGD is provided. A significant insight is that, since Algorithm 3 regularizes with the L₂ norm and penalizes with hinge-loss, omitting the positive semidefinite requirement for $M$ and vectorizing $M$ makes the algorithm equivalent to a one-class, linear support vector machine with $O(n^3)$ input vectors. Thus, the stochastic optimization is an instance of the PEGASOS algorithm, albeit a cleverly constructed one. The running time of PEGASOS does not depend on the input size, and instead scales with the dimensionality, the desired optimization error on the objective function ε and the regularization parameter λ. The optimization error ε is defined as the difference between the found objective value and the true optimal objective value, $f(\tilde{M}) - \min_M f(M)$.

Theorem 1: Assume that the data is bounded such that $\max_{(i,j,k) \in S} \|X C^{(i,j,k)} X^T\|_F^2 \le R$, and $R \ge 1$. During Algorithm 3 at iteration $T$, with $\lambda \le 1/4$, and batch-size $B = 1$, let

$\bar{M} = \frac{1}{T} \sum_{t=1}^{T} M_t$

be the average $M$ so far. Then, with probability of at least $1 - \delta$,

$f(\bar{M}) - \min_M f(M) \le \frac{84 R^2 \ln(T/\delta)}{\lambda T}.$  (11)

Consequently, the number of iterations necessary to reach an optimization error of $\epsilon$ is $\tilde{O}\left(\frac{1}{\lambda \epsilon}\right)$.
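To make the bound of Equation (11) concrete, the short sketch below evaluates its right-hand side and searches, by doubling, for an iteration count at which the bound falls below a target error. The default values of R, λ, and δ are arbitrary choices for illustration and are not taken from the disclosure; the function names are hypothetical.

```python
import math

def spml_error_bound(R, lam, T, delta):
    """Right-hand side of Equation (11)."""
    return 84.0 * R**2 * math.log(T / delta) / (lam * T)

def iterations_for(eps, R=1.0, lam=0.25, delta=0.05):
    """Smallest power of two T for which the bound is at most eps."""
    T = 2
    while spml_error_bound(R, lam, T, delta) > eps:
        T *= 2
    return T

T_needed = iterations_for(0.1)
print(T_needed, spml_error_bound(1.0, 0.25, T_needed, 0.05))
```

The bound contains no term in the number of nodes n; the required iteration count grows roughly in proportion to 1/(λε), up to the logarithmic factor.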

Proof (Theorem 1): The theorem is proven by realizing that Algorithm 3 is an instance of PEGASOS without a projection step on one-class data, since Corollary 2 in [S. Shalev-Shwartz, Y. Singer, N. Srebro, and A. Cotter. Pegasos: Primal estimated sub-gradient solver for SVM. Mathematical Programming. March 2011, Volume 127, Issue 1, pp 3-30] proves this same bound for traditional SVM input, also without a projection step. The input to the SVM is the set of all d×d matrices $X C^{(i,j,k)} X^T$ for each triplet $(i, j, k) \in S$.

Note that the large size of set $S$ plays no role in the running time; each iteration requires $O(d^2)$ work. Assuming the node feature vectors are of bounded norm, the radius of the input data $R$ is constant with respect to n, since each is constructed using the feature vectors of three nodes. In practice, as in the PEGASOS algorithm, various embodiments use $M_T$ as the output instead of the average, as doing so may perform better on real data, but an averaging version can be implemented by storing a running sum of $M$ matrices and dividing by $T$ before returning.

Graph 2(b) shows the training and testing prediction performance on one of the experiments described in detail below as stochastic SPML converges. The area under the receiver operator characteristic (ROC) curve is measured, which is related to the structure preserving hinge loss, and the plot shows fast convergence and quickly diminishing returns at higher iteration counts.

Variations of SPML

While stochastic SPML does not scale with the size of the input graph, evaluating distances using a full $M$ matrix requires $O(d^2)$ work. Thus, for high-dimensional data, one exemplary approach is to use principal component analysis or random projections to first reduce dimensionality. It has been shown that n points can be mapped into a space of dimensionality $O(\log n / \epsilon^2)$ such that distances are distorted by no more than a factor of $(1 \pm \epsilon)$. Another exemplary approach is to limit $M$ to be nonzero only along the diagonal. Diagonalizing $M$ reduces the amount of work to $O(d)$.
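As a hedged sketch of the diagonal variant, the distance and the per-triplet subgradient can both be computed with elementwise operations on length-d vectors. The names `diag_distance` and `diag_triplet_subgradient` are hypothetical, and the vector `m` holds the assumed diagonal of $M$.

```python
import numpy as np

def diag_distance(m, xi, xj):
    """D_M for M = diag(m): a weighted squared Euclidean distance, O(d) work."""
    diff = xi - xj
    return float(np.sum(m * diff * diff))

def diag_triplet_subgradient(X, i, j, k):
    """Diagonal of X C^(i,j,k) X^T, computed in O(d) without forming C."""
    dij = X[:, i] - X[:, j]
    dik = X[:, i] - X[:, k]
    return dij * dij - dik * dik

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))            # d = 4, n = 6
m = rng.uniform(0.1, 1.0, size=4)      # non-negative diagonal entries
diff = X[:, 0] - X[:, 1]
print(np.isclose(diag_distance(m, X[:, 0], X[:, 1]), diff @ np.diag(m) @ diff))
```

For a diagonal matrix, projection onto the PSD cone reduces to clipping negative entries of `m` to zero, so no eigendecomposition is needed in this variant.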

If modeling cross-feature interactions is necessary, another option for reducing the computational cost is to perform SPML using a low-rank factorization of M. In this case, all references to M can be replaced with L^(T)L, thus inducing a true metric without projection. The updated gradient with respect to L is simply

∇_(t)←2XCX^(T)L^(T)+λL_(t).  (12)

Using a factorization also allows replacing the regularizer with the squared Frobenius norm of the L matrix, which is equivalent to the nuclear norm of M. Using this formulation causes the objective to no longer be convex, but seems to work well in practice. Finally, when predicting links of new nodes, SPML does not know how many connections to predict. To address this uncertainty, a variant of SPML called degree distributional metric learning (DDML) can be used, which simultaneously learns the metric as well as parameters for the connectivity algorithm. Details on DDML and low-rank SPML are discussed below.
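The two properties of the factorization mentioned above can be checked numerically with a short sketch. It assumes L is an r×d matrix and again takes D_(M)(x,y)=(x−y)^(T)M(x−y); the variable names and random values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
r, d = 2, 4
L = rng.standard_normal((r, d))     # hypothetical low-rank factor
M = L.T @ L                         # induced d x d matrix, PSD by construction

x, y = rng.random(d), rng.random(d)
dist_via_L = float(np.sum((L @ x - L @ y) ** 2))    # O(r d) work
dist_via_M = float((x - y) @ M @ (x - y))           # O(d^2) work

frob_sq = float(np.sum(L ** 2))                               # squared Frobenius norm of L
nuclear = float(np.sum(np.linalg.svd(M, compute_uv=False)))   # nuclear norm of M
min_eig = float(np.linalg.eigvalsh(M).min())                  # nonnegative up to rounding
```

The distance computed through L matches the one computed through M, and the squared Frobenius norm of L matches the sum of singular values of M, so no projection onto the PSD cone is needed.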

Degree Distributional Metric Learning (DDML)

While SPML using k-nearest neighbors learns a structure preserving metric, one of its limitations is in predicting full graphs in an out-of-sample setting. On training data, the degree of each node is known, so the connectivity algorithm connects the exact number of neighbors as necessary to reconstruct the input graph. On a new set of nodes, however, the target degree is unknown. One method to address this is to learn a non-stationary degree preference function over node features that relates the features of a node to its target degree.

As one possible variant to structure preserving metric learning (SPML), degree distributional metric learning (DDML) simultaneously and/or concurrently learns a metric while also learning a parameterized, non-stationary degree preference function used to compute the connectivity of nodes. This extension can be understood as SPML with an adaptive connectivity algorithm, rather than the default k-nearest neighbors.

The connectivity algorithm uses a degree preference function g, which takes a node's feature vector x and a target degree k, and is parameterized by matrix S∈ℝ^(d×n). The score is then computed via

$\begin{matrix}{{g\left( {\left. k \middle| x \right.;S} \right)} = {\sum\limits_{k^{\prime} = 1}^{k}\; {x^{T}{s_{k^{\prime}}.}}}} & (13)\end{matrix}$

The score of a graph A is then the sum of all edge distances minus the degree preference function values for each node:

$\begin{matrix}{{F\left( {{\left. A \middle| X \right.;M},S} \right)} = {{\sum\limits_{ij}\; {A_{ij}{D_{M}\left( {x_{i},x_{j}} \right)}}} - {\sum\limits_{i}\; {{g\left( {\left. {\sum\limits_{j}\; A_{ij}} \middle| x_{i} \right.;S} \right)}.}}}} & (14)\end{matrix}$

The objective for DDML is otherwise analogous to that of SPML:

$\begin{matrix}{{{f(M)} = {{\frac{\lambda}{2}{M}^{2}} - {\sum\limits_{\overset{\sim}{A} \in B^{n \times n}}\; {\max \left( {{{F\left( {{\left. A \middle| X \right.;M},S} \right)} - {F\left( {{\left. \overset{\sim}{A} \middle| X \right.;M},S} \right)} + {\Delta \left( {A,\overset{\sim}{A}} \right)}},0} \right)}}}},} & (15)\end{matrix}$

where Δ denotes Hamming distance. In some embodiments, this objective is solvable via a cutting-plane style optimization by iteratively finding the worst-violating Ã and adding it to a constraint set. For concave degree preference functions, the worst-violated constraint can be found by converting the problem to a maximum weight b-matching on an augmented graph; thus, an additional concavity constraint on g is added to the optimization.

In various embodiments, an approach similar to the stochastic SPML algorithm can also be used to perform DDML much faster and, by parameterizing the degree preference function only up to a fixed maximum degree, to eliminate the dependence of the running time on the size of the graph. As in stochastic SPML, a DDML objective can be written in terms of triplets consisting of a node i, a neighbor j, and a disconnected node k. Let A^((i,j,k)) denote the false graph produced by toggling the edge between nodes i and j and the edge between nodes i and k. The DDML objective using the triplet-style constraints is

$\begin{matrix}{{f^{\deg}(M)} = {{\frac{\lambda}{2}{M}^{2}} - {\frac{1}{S}{\sum\limits_{{({i,j,k})} \in S}\; {{\max \left( {{{F\left( {{\left. A \middle| X \right.;M},S} \right)} - {F\left( {{\left. A^{({i,j,k})} \middle| X \right.;M},S} \right)} + 1},0} \right)}.}}}}} & (16)\end{matrix}$

The difference in scores decomposes into four scalar values, since the only differences in changing A to A^((i,j,k)) are that A^((i,j,k)) is missing edge (i,j), gains edge (i,k), the degree of node j decreases by one, and the degree of node k increases by one. Thus, the difference can be computed by evaluating the distance from node i to node j, the distance from node i to node k, the change in degree preference score from the degree of node j to its degree minus one, and the change in degree preference score from the degree of node k to its degree plus one. Let the degrees of all nodes be stored in array c, such that the degree of node j is c[j]. The difference is then computable as

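$\begin{matrix}{{F\left( {\left. A \middle| X \right.;M,S} \right)} - {F\left( {\left. A^{(i,j,k)} \middle| X \right.;M,S} \right)} = {D_{M}\left( {x_{i},x_{j}} \right)} - {D_{M}\left( {x_{i},x_{k}} \right)} + {x_{j}^{T}s_{(c[j] - 1)}} - {x_{k}^{T}s_{(c[k] + 1)}.}} & (17)\end{matrix}$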

This formulation eliminates the need for the expensive separation oracle and allows stochastic optimization. The gradient update for the metric parameter M is the same as in SPML. The gradient with respect to s_((c[j]−1)) is x_(j) and the gradient with respect to s_((c[k]+1)) is (−x_(k)).

To retain coherence between the different degree functions, a requirement that the resulting degree preference function for each node is concave can be added. In some embodiments, concavity is enforced by stochastically sampling a node i per iteration, and projecting S such that entries in x_(i)^(T)S are in decreasing order. The pseudocode for stochastic DDML is in Algorithm 4.

Algorithm 4 Stochastic Degree Distributional Metric Learning

Input:  A ∈ ^(n × n), X ∈ ℝ^(d × n), and  parameters  λ, T, B  1:  M₁ ← I_(d), S₁ ← 0_(d, n)  2:  Compute  degree  array  c  s.t.  c[i] = ∑_(j) A_(ij), ∀i  3:  for  t  from  1  to  T − 1  do$\mspace{11mu} \left. {4\text{:}\mspace{31mu} \eta_{t}}\leftarrow\frac{1}{\lambda \; t} \right.$  5:   C ← 0_(n, n)   6:   S^(′) ← λ S  7:   for  b  from  1  to  B  do  8:   (i, j, k) ← Sample  random  triplet  from  S = {(i, j, k)|A_(ij) = 1, A_(ik) = 0}  9:   if  F(A|X; M_(t), S_(t)) − F(A^((i, j, k))|X; M_(t), S_(t)) + 1 > 0  then10:    C_(jj) ← C_(jj) + 1, C_(ik) ← C_(ik) + 1, C_(ki) ← C_(ki) + 111:    C_(ij) ← C_(ij) − 1, C_(ji) ← C_(ji) − 1, C_(kk) ← C_(kk) − 112:    s_(c[j])^(′) ← s_(c[j])^(′) + x_(j)13:    s_(c[k])^(′) ← s_(c[k])^(′) − x_(k) 14:   end  if15:   end  for 16:   ∇_(t) ← XCX^(T) + λ M_(t)17:   M_(t + 1) ← M_(t) − η_(t)∇_(t)18:   S_(t + 1) ← S_(t) − η_(t)S^(′)19:   i ← Sample  random  index20:   Project  S  so  X_(i)^(T)  S  is  monotonically  nonincreasing21:   Optional:  M_(t + 1) ← [M_(t + 1)]⁺{Project  onto  the  PSD  cone}22:  end  for 23:  return  M_(T)

Low-Rank Structure Preserving Metric Learning

The low-rank variant of SPML computes all distances using a factorization L∈ℝ^(r×d) of M=L^(T)L, eliminating the need to compute a d×d matrix. Some existing metric learning algorithms use similar low-rank factorizations. Low-rank SPML has an additional parameter r, which limits the rank of M by explicitly determining the size of L. The optional projection onto the PSD cone is no longer necessary because L^(T)L always forms a valid metric by construction. This optimization is not convex, but initial experimental results seem to show that the stochastic optimization avoids local minima in practice. Algorithm 5 details the steps of low-rank SPML.

Algorithm 5 Low-Rank Structure Preserving Metric Learning with Nearest Neighbor Constraints and Optimization with Projected Stochastic Subgradient Descent

Input:  A ∈ ^(n × n), X ∈ ℝ^(d × n), and  parameters  λ, T, B, r   1:  L₁ ← rand(r, d){Initialize  L}  2:  for  t  from  1  to  T − 1  do$\mspace{11mu} \left. {3\text{:}\mspace{31mu} \eta_{t}}\leftarrow\frac{1}{\lambda \; t} \right.$  4:   C ← 0_(n, n)  5:   for  b  from  1  to  B  do  6:   (i, j, k) ← Sample  random  triplet  from  S = {(i, j, k)|A_(ij) = 1, A_(ik) = 0}  7:   ifL_(t)x_(i) − L_(t)x_(j)² − L_(t)x_(i) − L_(t)x_(k)² + 1 > 0  then  8:    C_(jj) ← C_(jj) + 1, C_(ik) ← C_(ik) + 1, C_(ki) ← C_(ki) + 1  9:    C_(ij) ← C_(ij) − 1, C_(ji) ← C_(ji) − 1, C_(kk) ← C_(kk) − 110:   end  if 11:   end  for12:   ∇_(t) ← 2XCX^(T)L_(t)^(T) + λ L_(t)13:   L_(t + 1) ← L_(t) − η_(t)∇_(t)14:  end  for15:  return  L_(T)

SPML Experiments

A variety of synthetic and real-world experiments are described below that elucidate the behavior of SPML. SPML performance is shown on a simple synthetic dataset that is easily visualized in two dimensions and which we believe mimics many traditional network datasets. Favorable performance for SPML is also shown in predicting links of the Wikipedia document network and the Facebook social network.

Synthetic Example

To better understand the behavior of SPML, consider the following synthetic experiment. First, n points are sampled from a d-dimensional uniform distribution. These vectors represent the true features for the n nodes X∈ℝ^(d×n). An adjacency matrix is computed by performing a minimum-distance b-matching on X. Next, the true features are scrambled by applying a random linear transformation: RX where R∈ℝ^(d×d). Given RX and A, the goal of SPML is to learn a metric M that undoes the linear scrambling, so that when b-matching is applied to RX using the learned distance metric, it produces the input adjacency matrix.

FIG. 18 illustrates the results of the above experiment for d=2, n=50, and b=4. In FIG. 18(a), we see an embedding of the graph using the true features for each node as coordinates, and connectivity generated from b-matching. In FIG. 18(b), the random linear transformation has been applied. We posit that many real-world datasets resemble FIG. 18(b), with seemingly incongruous feature and connectivity information. Applying b-matching to the scrambled data produces the connections shown in FIG. 18(c). Finally, by learning M via SPML (Algorithm 2) and computing L by Cholesky decomposition of M, features LRX can be recovered (FIG. 18(d)) that respect the structure in the target adjacency matrix and thus more closely resemble the true features used to generate the data.
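The recovery step can be illustrated without running SPML by plugging in one metric that is known to undo the scrambling exactly, namely M=R^(−T)R^(−1). The sketch below is only a sanity check of the Cholesky step under that assumption; it does not perform b-matching or learning, and the scrambling matrix is built from a random rotation and a fixed scaling (an illustrative choice that keeps R well conditioned).

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 2, 50
X = rng.random((d, n))                       # true features, uniform on [0, 1]^d

# Scrambling matrix: random rotation/reflection times a fixed axis scaling.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
R = Q @ np.diag([0.5, 2.0])
RX = R @ X                                   # scrambled features seen by the learner

# One metric that exactly undoes the scrambling (stands in for a learned M).
R_inv = np.linalg.inv(R)
M = R_inv.T @ R_inv

# np.linalg.cholesky returns lower-triangular G with M = G G^T, so L = G^T
# satisfies M = L^T L.
L = np.linalg.cholesky(M).T
recovered = L @ RX                           # the features "LRX" from the text

def pairwise_sq_dists(Z):
    """All squared Euclidean distances between the columns of Z."""
    sq = np.sum(Z ** 2, axis=0)
    return sq[:, None] + sq[None, :] - 2.0 * Z.T @ Z
```

Under this M, the recovered features equal the true features up to a rotation or reflection, so all pairwise distances, and hence any distance-based connectivity, agree with those of X.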

FIG. 18 illustrates that in this synthetic experiment, SPML finds a metric that inverts the random transformation applied to the features (b), such that under the learned metric (d) the implied connectivity is identical to the original connectivity (a) as opposed to inducing a different connectivity (c).

Link Prediction

SPML can be compared to a variety of methods for predicting links from node features: Euclidean distances, relational topic models (RTM), and traditional support vector machines (SVM). A simple baseline for comparison is how well the Euclidean distance metric performs at ranking possible connections. Relational topic models learn a link probability function in addition to latent topic mixtures describing each node. For the SVM, training examples are constructed consisting of the pairwise differences between node features. Training examples are labeled positive if there exists an edge between the corresponding pair of nodes, and negative if there is no edge. Because there are potentially O(n²) possible examples, and the graphs are sparse, we subsample the negative examples so that the number of randomly chosen negative examples equals the number of positive edges. Without subsampling, the SVM is unable to run the experiments in a reasonable time. The SVMPerf implementation for SVM in T. Joachims. Training linear SVMs in linear time. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 217-226, 2006, and the authors' code for RTM in J. Chang and D. Blei. Hierarchical relational models for document networks. Annals of Applied Statistics, 4:124-150, 2010 were used.
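The construction of the SVM training set described above might look like the following sketch. The text does not specify the exact form of the pairwise differences; the elementwise squared difference used here is an assumption, as are the function name and the toy graph.

```python
import random

def build_svm_examples(A, X, seed=0):
    """Return (features, labels) for an edge classifier.

    A is an n x n 0/1 adjacency matrix (list of lists) and X is a list of n
    feature vectors. Each example is the elementwise squared difference of a
    node pair (one possible reading of "pairwise differences").
    """
    rng = random.Random(seed)
    n = len(A)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    positives = [p for p in pairs if A[p[0]][p[1]] == 1]
    negatives = [p for p in pairs if A[p[0]][p[1]] == 0]
    # Subsample as many non-edges as there are edges, since the graph is sparse.
    negatives = rng.sample(negatives, min(len(positives), len(negatives)))

    def diff(i, j):
        return [(a - b) ** 2 for a, b in zip(X[i], X[j])]

    features = [diff(i, j) for i, j in positives + negatives]
    labels = [1] * len(positives) + [-1] * len(negatives)
    return features, labels

# Toy graph with five nodes and two edges: 0-1 and 2-3.
A = [[0, 1, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 0, 0]]
X = [[0.0, 0.0], [1.0, 0.0], [3.0, 1.0], [3.0, 2.0], [9.0, 9.0]]
features, labels = build_svm_examples(A, X)
```

The resulting feature and label lists could then be passed to any linear SVM trainer; the balanced sampling keeps the training set proportional to the number of edges rather than to n².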

Interestingly, an SVM with these inputs can be interpreted as an instance of SPML using diagonal M and the ε-neighborhood connectivity algorithm, which connects points based on their distance, completely independently of the rest of the graph structure. Therefore, SPML is expected to produce better performance in cases where the structure is important. The RTM approach may be appropriate for data that consists of counts, and is a generative model which recovers a set of topics in addition to link predictions. Despite the generality of the model, RTM does not seem to perform as well as discriminative methods in our experiments, especially in the Facebook experiment where the data is quite different from bag-of-words features. For SPML, the stochastic algorithm is run with batch size 10. The PSD projection step is skipped, since these experiments are only concerned with prediction, and obtaining a true metric is not necessary. SPML is implemented in MATLAB and requires only a few minutes to converge for each of the experiments below.

FIG. 19 illustrates the average ROC performance for the “graph theory topics” Wikipedia experiment (left), which shows a strong lift for SPML over competing methods. SPML converges quickly, with diminishing returns after many iterations (right).

Wikipedia Articles

SPML is applied to predicting links on Wikipedia pages. Imagine the scenario where an author writes a new Wikipedia entry and then, by analyzing the word counts on the newly written page, a prediction system is able to suggest which other Wikipedia pages it should link to. First, a few subnetworks are created consisting of all the pages in a given category, their bag-of-words features, and their connections. Three categories are chosen: “graph theory topics”, “philosophy concepts”, and “search engines”. A word dictionary of common words is used with stop-words removed. For each network, the data is split 80/20 for training and testing, where 20% of the nodes are held out for evaluation. On the remaining 80%, five-fold cross-validation is performed over the parameters for each algorithm (RTM, SVM, SPML), and a model is trained using the best-scoring regularization parameter. For SPML, the diagonal variant of Algorithm 3 is used, since the high dimensionality of the input features reduces the benefit of cross-feature weights. On the held-out nodes, each algorithm is tasked to rank the unknown edges according to distance (or another measure of link likelihood), and the accuracy of the rankings is compared using receiver operating characteristic (ROC) curves. Table 1 lists the statistics of each category and the average area under the curve (AUC) over three train/test splits for each algorithm. A ROC curve for the “graph theory” category is shown in FIG. 19(a). For “graph theory” and “search engines”, SPML provides a distinct advantage over other methods, while no method has a particular advantage on “philosophy concepts”. One possible explanation for why the SVM is unable to gain performance over Euclidean distance is that the wide range of degrees for nodes in these graphs may make it difficult to find a single threshold that separates edges from non-edges. In particular, the “search engines” category had an extremely skewed degree distribution, and is where SPML shows the greatest improvement.
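The ranking-based evaluation can be sketched as below. The function computes the area under the ROC curve as the fraction of (edge, non-edge) pairs that the distance ranking orders correctly, which is a standard equivalent formulation of AUC; the function name and the toy distances are made up for illustration.

```python
def auc_from_distances(distances, is_edge):
    """Area under the ROC curve when candidate pairs are ranked by distance
    (smaller distance = more likely edge). Equals the fraction of
    (edge, non-edge) pairs ordered correctly, counting ties as one half."""
    pos = [d for d, e in zip(distances, is_edge) if e]
    neg = [d for d, e in zip(distances, is_edge) if not e]
    if not pos or not neg:
        raise ValueError("need at least one edge and one non-edge")
    correct = 0.0
    for p in pos:
        for q in neg:
            if p < q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Five candidate pairs for a held-out node: two true edges, three non-edges.
distances = [0.2, 0.9, 0.4, 0.7, 0.3]
is_edge = [True, False, True, False, False]
auc = auc_from_distances(distances, is_edge)
```

In this toy case five of the six (edge, non-edge) pairs are ordered correctly, so the AUC is 5/6. The quadratic loop is fine for small examples; a sort-based rank computation would be preferable at scale.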

SPML is also applied to a larger subset of the Wikipedia network, by collecting word counts and connections of 100,000 articles in a breadth-first search rooted at the article “Philosophy”. The experimental setup is the same as previous experiments, but a 0.5% sample of the nodes is used for testing. The final training algorithm ran for 50,000 iterations, taking approximately ten minutes on a desktop computer. The resulting AUC on the edges of the held-out nodes is listed in Table 1 as the “Philosophy Crawl” dataset. The SVM and RTM do not scale to data of this size, whereas SPML offers a clear advantage over using Euclidean distance for predicting links.

Facebook Social Networks

Applying SPML to social network data allows prediction systems to more accurately predict who will become friends based on the profile information for those users. The Facebook data used includes a small subset of anonymized profile information for each student of a university, as well as friendship information. The profile information consists of gender, status (meaning student, staff, or faculty), dorm, major, and class year. Similarly to the Wikipedia experiments in the previous section, SPML is compared to Euclidean, RTM, and SVM. For SPML, a full M is learned via Algorithm 3. For each person, a sparse feature vector is constructed where there is one feature corresponding to every possible dorm, major, etc. for each feature type. Only people who have indicated all five feature types on their profiles are selected. Table 1 shows details of the Facebook networks for the four schools we consider: Harvard, MIT, Stanford, and Columbia. A separate experiment is performed for each school, randomly splitting the data 80/20 for training and testing. The training data is used to select parameters via five-fold cross validation, and train a model. The AUC performance on the held-out edges is also listed in Table 1. SPML attains the highest AUC for all four schools, which suggests that structural information is contributing to higher performance for SPML as compared to other methods.

TABLE 1 Wikipedia (top), Facebook (bottom) dataset and experiment information. Shown below: number of nodes n, number of edges m, dimensionality d, and AUC performance.

Dataset               n         m           d      Euclidean   RTM     SVM     SPML
Graph Theory          223       917         6695   0.624       0.591   0.610   0.722
Philosophy Concepts   303       921         6695   0.705       0.571   0.708   0.707
Search Engines        269       332         6695   0.662       0.487   0.611   0.742
Philosophy Crawl      100,000   4,489,166   7702   0.547       -       -       0.601
Harvard               1937      48,980      193    0.764       0.562   0.839   0.854
MIT                   2128      95,322      173    0.702       0.494   0.784   0.801
Stanford              3014      147,516     270    0.718       0.532   0.784   0.808
Columbia              3050      118,838     251    0.717       0.519   0.796   0.818

FIG. 20 provides a comparison of Facebook social networks from four schools in terms of feature importance computed from the learned structure preserving metric.

By looking at the weight of the diagonal values in M normalized by the total weight, it can be determined which feature differences are most important for determining connectivity. FIG. 20 shows the normalized weights averaged by feature type for the Facebook data, compared across the four schools. For all schools except MIT, the graduating year is most important for determining distance between people. For MIT, dorms are the most important features. A possible explanation for this difference is that MIT is the only school in the list that makes it easy for students to stay in a residence for all four years of their undergraduate program, and therefore which dorm one lives in may affect more strongly the people they connect to.
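The normalization and per-type averaging described above can be sketched as follows. The diagonal values, the feature-type labels, and the function name are made up for illustration, and the sketch assumes the diagonal weights are nonnegative so that dividing by their sum is meaningful.

```python
def feature_type_importance(diag_M, feature_types):
    """Normalize diagonal weights by their total, then average per feature type."""
    total = sum(diag_M)
    sums, counts = {}, {}
    for weight, ftype in zip(diag_M, feature_types):
        sums[ftype] = sums.get(ftype, 0.0) + weight / total
        counts[ftype] = counts.get(ftype, 0) + 1
    return {ftype: sums[ftype] / counts[ftype] for ftype in sums}

# Made-up diagonal of M for a 5-dimensional one-hot encoding.
diag_M = [4.0, 2.0, 1.0, 1.0, 2.0]
feature_types = ["year", "year", "dorm", "dorm", "major"]
importance = feature_type_importance(diag_M, feature_types)
```

In this toy example the "year" features receive the largest average normalized weight (0.3), followed by "major" (0.2) and "dorm" (0.1).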

These SPML experiments demonstrate a fast convex optimization for learning a distance metric from a network such that the distances are tied to the network's inherent topological structure. The structure preserving distance metrics described herein allow better modeling and prediction of the behavior of large real-world networks. Furthermore, these metrics are as lightweight as independent pairwise models, but capture structural dependency from features, making them easy to use in practice for link prediction. SPML's lack of dependence on graph size can be used to learn a structure preserving metric on massive-scale graphs, e.g., the entire Wikipedia site. Since each iteration requires only sampling a random node, following a link to a neighbor, and sampling a non-neighbor, this can all be done in an online fashion as the algorithm crawls a network such as the World Wide Web, learning a metric that may gradually change over time.

DDML Experiments

Using DDML on the same Wikipedia experiments described above, DDML achieves AUC comparable to that of SPML. On “graph theory”, “philosophy concepts”, and “search engines”, DDML scores AUCs of 0.691, 0.746, and 0.725, respectively. While these scores are quite close to those of SPML, the DDML variant provides a tradeoff between running time and model richness. In the case of the Wikipedia category “philosophy concepts”, DDML even provides a performance improvement, which may indicate a clear signal in degree preference learnable from the word counts.

FIG. 21 illustrates a ROC curve for various algorithms on the “philosophy concepts” category.

Low Rank SPML Experiments

Low-rank SPML is run on the Harvard Facebook data, fixing λ=1e-5 and varying the rank parameter r. The ROC curves and AUC scores using training data for different ranks are in Graph 5. With greater rank, SPML has more flexibility to construct a metric that fits the training data, but lower rank provides a tradeoff between efficiency and reconstruction quality. On this dataset (d=193), a rank of r=5 appears sufficient to represent the structure preserving metric, while reducing the number of parameters from d²=37,249 to d×r=965. Training fewer parameters requires less time, and allows low-rank SPML to handle large-scale networks with many nodes and high-dimensional features.

FIG. 22 illustrates the performance of low-rank SPML on training data varying the rank parameter, run on a single Facebook school. The results imply that a significantly smaller rank than the true feature dimensionality is sufficient to fit the training data.

In summary, DDML is an extension of SPML that learns degree preference functions, which are used in addition to the learned distances to predict a graph. DDML aims to learn a richer model than SPML, yet uses a comparable learning algorithm which also can learn from large-scale input.

FIG. 1A is a block diagram of an exemplary embodiment of a network system 150 with an SPML/DDML link prediction system 152 according to some embodiments of the disclosed subject matter. System 150 can include an SPML/DDML link prediction system 152 and network data 156. The network data 156 can include a plurality of nodes 166 and 188, each of which can include features (or properties) 170 and 172, and node links/connections 164. The SPML/DDML link prediction system 152 can receive data from and transmit data to a user terminal 154. In operation, the SPML/DDML link prediction system 152 can receive link prediction requests from and transmit link predictions to the user terminal 154 according to the processes shown in FIGS. 1B and 2-7.

It will be appreciated that the network data 156 can be stored in a database system connected to the SPML/DDML link prediction system 152 via a network. Optionally, the network data 156 can be stored locally in memory attached to the link prediction system 152.

FIG. 1B is a flowchart showing an exemplary method for using structure preserving metric learning (SPML) connection prediction in a network recommender process. Processing begins at 102 and continues to 104.

At 104, a connection (also often characterized as a “link”) prediction request is received from a prediction requestor. The connection prediction request can include information pertaining to a node 114 for which predicted connections are requested. The connection prediction request can, for example, be a request from a user of a social network system that has requested the social network system to recommend a list of new connections for the user, as shown in FIGS. 9 and 10. In another example, a link prediction request can be a request from a user of a document network system that has requested a list of new links between the user's document and other relevant documents, as shown in FIGS. 11 and 12. Additionally, the connection prediction request can, for example, be a request from a user of a dating service system that has requested the dating service system to recommend a list of new connections for the user, as shown in FIGS. 13 and 14. In another example, the connection prediction request can be a request from a user of a shopping service system that has requested the shopping service system to recommend a list of recommended products for the user. In each of these examples, the connection prediction request can be initiated by a user of the systems, and/or the connection prediction request can be initiated by the systems with or without interaction from a user. For example, a component of a social network system can generate, without interaction from the user, a connection prediction request for one or more users of the social network system to provide new connection predictions (or recommendations) to the user unsolicited by the user. In another example, a component of a social network system can generate a connection prediction request in response to a user registering to join, logging into the social network system, and/or changing their profile information in the social network system, and provide the new connection predictions to the user without the user directly requesting new connection predictions. Processing continues to 106.

At 106, SPML or DDML processing is performed to generate an output 124 that can include a list of predicted connections, or links, 126. SPML or DDML processing is performed based on the input 112 that can include the node 114 indicated in the received request and a network 118 of which the node 114 is a member; alternatively, the node 114 can be a new node that is not currently a member. The network 118 can include nodes 120 (each node having properties or features that characterize that node) and connections (links) 122 between them.

As indicated at 116, the node data includes property data (or features) 116 that provides characteristics of the node, for example, characteristics of a social network user. In a social networking system, the node 114 represents the user and the node features 116 can include many characteristics of the user, including but not limited to the user's age, sex, status, college, college major, college dorm, college graduation year, etc. In the document network example, the node 114 can represent the document for which new links have been requested and the node features 116 can include, but are not limited to, word counts, bag-of-words features, and other document characteristics. Processing continues to 108.

At 108, the predicted connections, or links, 126 are transmitted to the prediction requestor. The predicted connections can be transmitted to the prediction requestor in a ranked list such that the first predicted connection is, using the learned structure preserving distance metric, closer to the input node than the second predicted connection and so on. Optionally, class information can be transmitted to the prediction requestor identifying the class or some other correlation that exists between the input node and each predicted connection which resulted in the connection being predicted. Processing continues to 110, where processing ends.

FIG. 2 is a flowchart showing an exemplary embodiment of a structure preserving metric learning (SPML) connection prediction method 200 according to some embodiments of the disclosed subject matter. Processing begins at 202 and continues to 204.

At 204, a connection prediction request is received from a prediction requestor such as a social network user in conjunction with a social network service provider. The connection prediction request can indicate a node 214 for which predicted connections are requested. A connection prediction request can, for example, be a request from a user of a social network system that has requested the social network system to recommend a list of new connections for the user. In another example, a link prediction request can be a request from a user of a document network system that has requested a list of new links to other relevant documents. Processing continues to 206.

At 206, processing is performed based on an input 212 to generate an output 224 that can include a list of predicted connections, or links, 226. The input 212 can include a structure preserving distance metric 218 and the node 214 for which predicted connections were requested. The node 214 can belong to a network of nodes and connections, and SPML can be used to learn the structure preserving distance metric 218 between the nodes of the network. The node 214 can include node features 216. The structure preserving distance metric 218, the node 214, and the node features 216 can be used to generate the list of predicted connections, or links, 226. Processing continues to 208.

At 208, the predicted connections, or links, 226 are transmitted to the prediction requestor. The predicted connections can be transmitted to the prediction requestor in a ranked list such that the first predicted connection is, using the learned structure preserving distance metric, closer to the input node than the second predicted connection and so on. Optionally, class information can be transmitted to the prediction requestor identifying the class or some other correlation that exists between the input node and each predicted connection which resulted in the connection being predicted. Processing continues to 210, where processing ends.

FIG. 3 is a diagram of an exemplary embodiment of a structure preserving metric learning (SPML) prediction system according to some embodiments of the disclosed subject matter. System 300 can include a laptop user computer 302, a desktop user computer 304, a smartphone user computer 306 and a web server 310. The laptop user computer 302, desktop user computer 304, and smartphone user computer 306 can transmit data to and/or receive data from the web server 310 via a network 308.

In operation, a user operating the laptop user computer 302, desktop user computer 304, and/or smartphone user computer 306 can, via a web browser, send a request to the web server 310.

The user request can, for example, include a request to join a social networking site and receive a list of recommended connections, or a request for an existing user of the social networking site to receive a list of recommended new connections. In this example, the web server 310 can, given the user's profile information and/or features, generate a list of predicted new connections for the user according to the SPML or DDML methods provided herein. The SPML/DDML enabled web server 310 can, in this example, transmit the list of predicted new connections to the requesting user via the network 308.

In another example, the request can include a request to submit a new document to an online document network and receive a list of recommended links for the new document, or a request to receive recommended new links for an existing document in the document network. In this example, the web server 310 can, given the document's word count, bag-of-words, and/or document features, generate a list of predicted new links relevant to the document according to the SPML or DDML methods provided herein. The web server 310 can, in this example, transmit the list of predicted new links to the requesting user via the network 308.

FIG. 4 is a flowchart showing an exemplary method of SPML/DDML link prediction 400 according to some embodiments of the disclosed subject matter. Processing begins at 402 and continues to 404.

At 404, network data including node properties and node links is stored on a data store accessible by a link prediction processor. For example, the network data can be stored in a database server and the link prediction processor can be a computer network server that can, for example, access the database server via a network. The network data can, for example, represent social networks such as Facebook, MySpace, and similar networks, dating service networks such as eHarmony, Match.com, and similar networks, document networks such as Wikipedia and similar networks, and shopping networks such as Amazon.com and similar networks, as described in FIGS. 9-16. Processing continues to 406 or optionally continues concurrently or sequentially to 406 and 408.

At 406, the link prediction processor learns a structure preserving distance metric by performing a structure preserving metric learning process such as one of the SPML or DDML implementations discussed above, such as, but not limited to, Stochastic DDML or cutting plane DDML.

Optionally, processing can concurrently or sequentially continue to 408 where the link prediction processor can learn a degree prediction function. For example, the link prediction processor can perform 406 and 408 concurrently by performing one of the DDML implementations to learn a structure preserving distance metric and a degree prediction function concurrently.

Processing continues to 410. At 410, a request for new link predictions for a specified node with node properties is received from a link prediction requestor. The specified node can be a new node not already represented in the network data or an existing node. Processing continues to 412.

At 412, new links are predicted for the node specified in the request based on the requested node properties, the learned structure preserving distance metric, and optionally the learned degree prediction function. If 408 is not performed and the degree prediction function is not learned, a predetermined number of new links can be predicted for each requested node. The predicted new links can be transmitted to the link prediction requestor in a ranked list such that the first predicted new link node is, under the learned structure preserving distance metric, closer to the specified node than the second predicted new link node and so on. Optionally, class information can be transmitted to the link prediction requestor identifying the class and/or some other correlation that exists between the specified node and each predicted new link which resulted in the connection being predicted. Processing continues to 414.

At 414, the predicted new links are transmitted to the link prediction requestor. Processing continues to 416 where processing ends.
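The prediction and ranking performed at 412 can be sketched as follows. This is a minimal illustration only, assuming the learned structure preserving distance metric takes the form of a positive semidefinite matrix M applied to feature differences. The names `metric_distance`, `predict_links`, and `default_k` are hypothetical and chosen for this example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def metric_distance(x: Vector, y: Vector, M: Matrix) -> float:
    """Squared distance (x - y)^T M (x - y) under an assumed learned matrix M."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def predict_links(
    query: Vector,
    nodes: Dict[str, Vector],
    M: Matrix,
    predicted_degree: Optional[int] = None,
    default_k: int = 2,
) -> List[Tuple[str, float]]:
    """Rank candidate nodes by distance to the query node, closest first.

    If no degree prediction is available, fall back to a predetermined
    number of links (default_k), mirroring the fallback described at 412.
    """
    k = predicted_degree if predicted_degree is not None else default_k
    scored = [(name, metric_distance(query, feats, M)) for name, feats in nodes.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy data (hypothetical): the second feature is down-weighted by M.
nodes = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 3.0]}
M = [[1.0, 0.0], [0.0, 0.1]]
ranked = predict_links([0.0, 1.0], nodes, M)
ranked_names = [name for name, _ in ranked]
```

With this toy matrix, node "c" outranks node "b" even though "b" is closer in plain Euclidean distance, which is the kind of reordering a learned metric is intended to produce.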

It will be appreciated that the link prediction requestor can, for example, be an end user of a social network service, a document network service, a dating service, or a shopping service, or any other similar service. It will also be appreciated that the link prediction requestor can, for example, be an internal component of any of these services that can request link predictions for any of its users and provide the predicted links to its users with or without a user having to initiate such a request. For example, any of these services can include a registration component that upon a new user registering for the service automatically submits a new link prediction request and presents the new link predictions to the user without the user having to submit a request (see, for example, FIGS. 9, 11, 13, and 15). In another example, any of these services can periodically or upon a change in node properties (e.g., user profile change in a social network or document edits applied in a document network service) submit a new link prediction request and transmit the new link predictions to the user without the user having to submit a request. In these examples, the user can receive link predictions from the services and submit a request for new link predictions via, for example, a website, and the new link prediction results can be provided to the user through the website or via an electronic messaging service such as e-mail or instant messaging.

It will also be appreciated that the method can be repeated in whole or in part. For example, 406 and optionally 408 can be repeated to maintain current learned distance metrics and degree prediction functions as changes to the stored network data occur over time (such as node properties and node links changing over time, for example, when a user in a social network service updates their profile or adds/removes friends).

FIG. 5 is a flowchart showing an exemplary method of SPML/DDML link prediction 500 according to some embodiments of the disclosed subject matter. Processing begins at 502 and continues to 504.

At 504, network data, similar to that described in FIG. 4 above, is provided including node properties and node links. Processing continues to 506.

At 506, a learned structure preserving distance metric and optionally a learned degree preference function are provided. Processing continues to 508.

At 508, a request for new link predictions for a specified node with node properties is received from a link prediction requestor. The specified node can be a new node not already represented in the network data or an existing node. Processing continues to 510.

At 510, new links are predicted for the node specified in the request based on the requested node properties, the learned structure preserving distance metric, and optionally the learned degree prediction function. If the degree prediction function is not provided, a predetermined number of new links can be predicted for each requested node. Processing continues to 512.

At 512, the predicted new links are transmitted to the link prediction requestor. The predicted new links can be transmitted to the link prediction requestor in a ranked list such that the first predicted new link node is, under the learned structure preserving distance metric, closer to the specified node than the second predicted new link node and so on. Optionally, class information can be transmitted to the link prediction requestor identifying the class and/or some other correlation that exists between the specified node and each predicted new link which resulted in the connection being predicted. Processing continues to 514 where processing ends.

FIG. 6 is a flowchart showing an exemplary method of DDML link degree prediction 600 according to some embodiments of the disclosed subject matter. Processing begins at 602 and continues to 604.

At 604, network data including node properties and node links is stored on a data store accessible by a link prediction processor, as described above in FIG. 4. Processing continues to 606.

At 606, a degree prediction function is learned for the network data according to one of the DDML processes described above, such as, for example, Stochastic DDML or cutting plane DDML. Processing continues to 608.

At 608, a request to predict the degree of a specified node given its node properties is received from a degree prediction requestor. Processing continues to 610.

At 610, a predicted degree for the specified node is generated based on the specified node's properties using the learned degree preference function according to one of the DDML processes described above. The predicted degree can, for example, be in the form of a probability that the specified node will have a specified degree. Processing continues to 612.

At 612, the predicted degree is transmitted to the degree prediction requestor. Processing continues to 614 where processing ends.

It will be appreciated that the method 600 can be repeated in whole or in part to, for example, maintain a current learned degree preference function as changes occur in the network data (such as changes in the node properties or node links, for example, when a user of a social network service updates their profile or adds/removes friends). For example, 606 can be repeated periodically or upon a change in the network data to maintain a current learned degree preference function.
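One way to picture a predicted degree expressed as a probability is sketched below. The scoring form is an assumption made purely for illustration and is not the DDML degree preference function itself: each candidate degree gets a hypothetical linear score on the node's features, and the scores are normalized with a softmax. The names `degree_distribution` and `weights` are invented for this example.

```python
import math
from typing import List, Sequence


def degree_distribution(x: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Return P(degree = k | x) for k = 0 .. len(weights) - 1.

    Assumption: degree k is scored by the dot product weights[k] . x.
    This linear-softmax form is a stand-in, not the patent's learned function.
    """
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical weights for degrees 0, 1, and 2 over two node features.
weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
probs = degree_distribution([0.5, 2.0], weights)
most_likely_degree = max(range(len(probs)), key=probs.__getitem__)
```

A degree prediction requestor could be sent either the full list `probs` or only the most likely degree, depending on the service.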

FIG. 7 is a flowchart showing an exemplary method of SPML/DDML link prediction using network partitioning 700 according to some embodiments of the disclosed subject matter. Processing begins at 702 and continues to 704.

At 704, network data including node properties and node links is stored on a data store accessible by a link prediction processor, as described above in FIGS. 4 and 6. Processing continues to 706.

At 706, the network data is partitioned. Partitioning the network data can, for example, be performed to allow SPML/DDML processes, such as the DDML cutting plane optimization, to be run on smaller segments, or partitions, of the network, so that these processes can be utilized with large networks. As indicated elsewhere, natural partitions may arise due to barriers to linking, for example, training data from different schools. Processing continues to 708.

At 708, a structure preserving distance metric is learned by performing a structure preserving metric learning process, such as one of the SPML or DDML implementations described above (e.g., the DDML cutting plane optimization), for each of the partitions created in 706. Optionally, a degree preference function can be learned for each partition. For example, when using the DDML cutting plane optimization on each partition, a structure preserving distance metric and degree preference function can be learned concurrently for each partition. Processing continues to 710.

At 710, a request for new link predictions for a specified node with node properties is received from a link prediction requestor. The specified node can be a new node not already represented in the network data or an existing node. Processing continues to 712.

At 712, at least one of the partitions created in 706 is selected based on the specified node's properties. Partition selection can also account for the specified node's existing links if the specified node is an existing node in the network data. Processing continues to 714.

At 714, new links are predicted for the node specified in the request based on the partitions selected in 712, the requested node properties, the learned structure preserving distance metric, and optionally the learned degree prediction function. If the degree prediction function is not learned, a predetermined number of new links can, for example, be predicted for each requested node. Processing continues to 716.

At 716, the predicted new links are transmitted to the link prediction requestor. The predicted new links can be transmitted to the link prediction requestor in a ranked list such that the first predicted new link node is, under the learned structure preserving distance metric, closer to the specified node than the second predicted new link node and so on. Optionally, class information can be transmitted to the link prediction requestor identifying the class and/or some other correlation that exists between the specified node and each predicted new link which resulted in the connection being predicted (which can include an indication of the partition used for link prediction). Processing continues to 718 where processing ends.

It will be appreciated that the partitioning of the network data can be performed in various ways depending on the type of network represented by the network data. For example, in a dating service network, the network data can be partitioned geographically under the premise that users in the same geographic area are more likely to be linked and recommended for dates than those that are geographically remote.

It will also be appreciated that partitioning the network allows for parallelization of the learning performed at 708, and learning across each partition can be distributed across link prediction processor components, as described in FIG. 8.
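The partition, learn, select, and predict flow of 706 through 714 can be sketched as follows. This is an illustrative outline under stated assumptions: nodes carry a hypothetical `region` property used as the partition key, and `learn_metric_stub` is a placeholder (inverse-variance feature weights) standing in for an actual SPML/DDML learner, which is not reproduced here.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_nodes(nodes, key):
    """Group nodes by the value of one property (for example, geography)."""
    parts = defaultdict(dict)
    for name, props in nodes.items():
        parts[props[key]][name] = props
    return dict(parts)


def learn_metric_stub(partition):
    """Placeholder learner, NOT SPML/DDML: diagonal weights 1 / (variance + eps)."""
    feats = [props["features"] for props in partition.values()]
    weights = []
    for j in range(len(feats[0])):
        col = [f[j] for f in feats]
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col)
        weights.append(1.0 / (var + 1e-6))
    return weights


def learn_all(partitions):
    """Learn one metric per partition; partitions are independent, so run in parallel."""
    with ThreadPoolExecutor() as pool:
        metrics = pool.map(learn_metric_stub, partitions.values())
        return dict(zip(partitions.keys(), metrics))


def predict_in_partition(query, partitions, metrics, key, k=1):
    """Select the partition matching the query node, then rank only its nodes."""
    part = query[key]
    w = metrics[part]

    def dist(props):
        return sum(wi * (a - b) ** 2 for wi, a, b in zip(w, query["features"], props["features"]))

    ranked = sorted(partitions[part], key=lambda n: dist(partitions[part][n]))
    return part, ranked[:k]


nodes = {
    "u1": {"region": "east", "features": [0.0, 1.0]},
    "u2": {"region": "east", "features": [1.0, 1.0]},
    "u3": {"region": "west", "features": [0.1, 1.0]},
    "u4": {"region": "west", "features": [5.0, 0.0]},
}
partitions = partition_nodes(nodes, "region")
metrics = learn_all(partitions)
query = {"region": "east", "features": [0.2, 1.0]}
used_partition, links = predict_in_partition(query, partitions, metrics, "region")
```

In this toy run node "u3" is never considered, even though its features are close to the query, because it falls in a different partition. The returned partition name corresponds to the optional indication of the partition used for link prediction.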

It will also be appreciated that the link prediction request described above in FIGS. 4, 6, and 7 can, in some embodiments, specify a plurality of nodes. In such embodiments, links can be predicted among only the specified nodes to create a new network among those nodes, or links can be predicted among the specified nodes and the existing nodes in the network data.
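Predicting links among only a set of specified nodes can be sketched as follows. The connectivity rule used here, linking each node to its k nearest neighbors under the metric, is an assumption made for the example, as are the names `sq_dist` and `predict_network`.

```python
def sq_dist(x, y, M):
    """Squared distance (x - y)^T M (x - y) under an assumed learned matrix M."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def predict_network(nodes, M, k=1):
    """Build undirected links among the given nodes only.

    Each node proposes links to its k closest peers; proposals are merged
    into a sorted list of undirected edges.
    """
    edges = set()
    for name, feats in nodes.items():
        peers = sorted(
            (other for other in nodes if other != name),
            key=lambda other: sq_dist(feats, nodes[other], M),
        )
        for other in peers[:k]:
            edges.add(tuple(sorted((name, other))))
    return sorted(edges)


# Hypothetical request specifying four nodes; M is the identity for simplicity.
requested = {"p": [0.0, 0.0], "q": [1.0, 0.0], "r": [5.0, 0.0], "s": [6.0, 0.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]
new_network = predict_network(requested, identity, k=1)
```

To predict links among the specified nodes and the existing nodes instead, the candidate pool in `peers` would be drawn from the union of both sets.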

FIG. 8 is a block diagram of an exemplary embodiment of a distributed structure preserving metric learning (SPML/DDML) link prediction system 800 according to some embodiments of the disclosed subject matter. System 800 can include network data 804 that can be partitioned into a plurality of network partitions 826, 828, and 830, and a link prediction processor 802 which can include a plurality of link prediction processing components 808, 810, and 812. Each network partition can include nodes with properties 832, 836, and 840, and node links 834, 838, and 842. System 800 can also include a link prediction requestor 806 that can transmit data to and receive data from the link prediction processor 802. The link prediction processor 802 can transmit data to and receive data from the plurality of link prediction processing components 808, 810, and 812, each of which can be configured to access a partition of the network data 826, 828, and 830.

In operation, the plurality of link prediction processing components 808, 810, and 812 can learn concurrently or in parallel a structure preserving metric from their respective network partitions 826, 828, and/or 830, according to the method described in FIG. 7. The link prediction processor 802 can receive a link prediction request from the link prediction requestor 806, process the request to predict links for the specified node(s)/user(s), and transmit the predicted links to the link prediction requestor 806, according to the method described above in FIG. 7.

FIG. 9 is a block diagram of a system for predicting friendships to new users of a social network using SPML/DDML according to some embodiments of the disclosed subject matter. In particular, the system 900 includes a social network service provider 902 that is coupled to an SPML/DDML link prediction system 924 that can include a structure preserving distance-metric learning component 903 and a degree prediction function learning component 926. The social network service provider 902 is also coupled to an electronic data storage having stored therein data representing a plurality of social network members (906-908) each having a respective set of properties/features or profile information (910-912) and a respective set of friendship information (920-922). The social network provider 902 receives the profile information (910-912) and friendship information (920-922) from one or more respective users (906-908). In response to the received profile information (910-912) and friendship information (920-922), the social network provider 902 performs SPML/DDML link prediction using the SPML/DDML link prediction system 924 including the structure preserving distance-metric learning component 903 and the degree prediction function learning component 926 to predict friendships for users based on their profile information. The SPML/DDML link prediction system 924 can, for example, learn a structure preserving distance metric using the structure preserving distance-metric learning component 903 and learn a degree prediction function using the degree prediction function learning component 926, as described in FIGS. 4-7 where the profile information (910-912) is treated as node properties and the friendship information (920-922) as node links. For example, a new user 928 can register to join the social network and the social network provider 902 receives the new user 928 and the new user's profile information 930. In response to the new user 928, the social network provider 902 can predict friendships to the new user 928 based on the new user's profile information 930 using the SPML/DDML link prediction system 924. The predicted new friendships can be communicated to the user for their approval (e.g., each user may receive an email listing the new predicted friendships or be directed to a web page listing the new predicted friendships). For example, a results set 932 (e.g., in an email or displayed on the user's page at the social network site) can be provided for the new member 928. Within the results are listed the new links 934 selected to match the new member 928. The predicted new links 934 can be provided to the new user in a ranked list such that the first predicted new link node is, under the learned structure preserving distance metric, closer to the new user than the second predicted new link node and so on. Optionally, class information can be transmitted to the new user identifying the class and/or some other correlation that exists between the new user and each predicted new link which resulted in the connection being predicted.

In this example and in the example provided in FIG. 10, the nodes of the network data include the members of the social network service. The node properties include member profile information and the links include friendships between members of the social network service.

It will be appreciated that the social network provider 902 can, in addition to providing new user 928 with the list of predicted new friends 934, also provide the new user 928 as a predicted new friend to those existing users in the list of predicted new friends 934, for example, via an email message or through a message on the social network website.

FIG. 10 is a block diagram of a system for predicting friendships between users of a social network using SPML/DDML according to some embodiments of the disclosed subject matter. In particular, the system 1000 includes the social network service provider 902, SPML/DDML link prediction system 924, structure preserving distance-metric learning component 903, degree prediction function learning component 926, electronic data storage, plurality of social network members (906-908) each having a respective set of properties/features or profile information (910-912) and a respective set of friendship information (920-922), as described above in FIG. 9. System 1000 also includes, for two users of the social network site, x and y, predicted friendship results 1002-1004 that each include a plurality of predicted friendships. The social network provider 902 receives updates from users to modify their social network member data such as their profile information and existing friendships, when, for example, a user adds a new friend or changes their profile information. In response to these changes, the social network provider 902 can perform SPML/DDML link prediction using the SPML/DDML link prediction system 924 to predict new friendships for users based on their changed profile information. For example, the social network provider 902 can create predicted new friendships 1006 and 1008 when users x and y update their profile information and/or add/drop friends and provide the results 1002 and 1004 to users x and y (e.g., in an email or displayed on the user's page at the social network site).

In another example, the social network provider 902 can perform SPML/DDML link prediction using the SPML/DDML link prediction system 924 to predict new friendships for users periodically or on-demand.

FIG. 11 is a block diagram of a system for predicting links to new documents added to an information network using SPML/DDML according to some embodiments of the disclosed subject matter. In particular, the system 1100 includes an information network service provider 1102 that is coupled to an SPML/DDML link prediction system 1124 that can include a structure preserving distance-metric learning component 1103 and a degree prediction function learning component 1126. The information network service provider 1102 is also coupled to an electronic data storage having stored therein data representing a plurality of information network documents (1106-1108) each having a respective set of document properties (1110-1112) including bag-of-words containing words occurring in the document and a respective set of citation and/or link information (1120-1122). The information network provider 1102 receives the document properties (1110-1112) and citation and/or link information (1120-1122) from one or more respective documents (1106-1108). In response to the received document properties (1110-1112) and citation and/or link information (1120-1122), the information network provider 1102 performs SPML/DDML link prediction using the SPML/DDML link prediction system 1124 including the structure preserving distance-metric learning component 1103 and the degree prediction function learning component 1126 to predict citations/links for documents based on their document properties including bag-of-words information. The SPML/DDML link prediction system 1124 can, for example, learn a structure preserving distance metric using the structure preserving distance-metric learning component 1103 and learn a degree prediction function using the degree prediction function learning component 1126, as described in FIGS. 4-7 where the document properties (1110-1112) are treated as node properties and the citation and/or link information (1120-1122) as node links. For example, a new document 1128 can be submitted to the information network and the information network provider 1102 receives the new document 1128 and the new document's properties 1130. In response to the new document 1128, the information network provider 1102 can predict links/citations to other relevant documents for the new document 1128 based on the new document's properties 1130 including its bag-of-words using the SPML/DDML link prediction system 1124. The predicted new links/citations can be communicated to the author or submitter of the new document 1128 for their approval (e.g., each user may receive an email listing the new predicted links/citations or be directed to a web page listing the new predicted links/citations). For example, a results set 1116 (e.g., in an email or displayed on the author's or submitter's page at the information network site) can be provided for the new document 1128. Within the results are listed the new links/citations 1118 predicted to match the new document 1128.

In this example and in the example provided in FIG. 12, the nodes of the network data include the documents in the information network service. The node properties include document properties including bag-of-words and the links include links/citations between documents of the information network service.

It will be appreciated that the information network provider 1102 can, in addition to providing the author or submitter of new document 1128 with the list of predicted new links/citations 1118, also provide the new document 1128 as a predicted new link/citation for those existing documents in the list of predicted new links/citations 1118, via, for example, an email message or a message on the information network website.

FIG. 12 is a block diagram of a system for predicting links between documents in an information network using SPML/DDML according to some embodiments of the disclosed subject matter. In particular, the system 1200 includes the information network service provider 1102, SPML/DDML link prediction system 1124, structure preserving distance-metric learning component 1103, degree prediction function learning component 1126, electronic data storage, plurality of information network documents (1106-1108) each having a respective set of document properties (1110-1112) and a respective set of links/citations information (1120-1122), as described above in FIG. 11. System 1200 also includes, for two documents of the information network site, x and y, predicted links/citations results 1204-1206 that each include a plurality of predicted links/citations 1202, 1208. The information network provider 1102 receives updates from users/authors to modify their document data such as document properties and existing links/citations, when, for example, a new link/citation is added to a document or the content of the document is modified. In response to these changes, the information network provider 1102 can perform SPML/DDML link prediction using the SPML/DDML link prediction system 1124 to predict new links/citations for documents based on their modified document properties. For example, the information network provider 1102 can create predicted new links/citations 1202 and 1208 when the content of documents x and y is modified, when their document properties are modified, when links/citations for the documents are added/removed, and/or for some other event, and provide the results 1206 and 1204 to users x and y (e.g., in an email or displayed on the user's page at the information network site).

FIG. 13 is a block diagram of a system for predicting connections to new members joining a dating service using SPML/DDML according to some embodiments of the disclosed subject matter. In particular, the system 1300 includes a dating service provider 1302 that is coupled to an SPML/DDML link prediction system 1324 that can include a structure preserving distance-metric learning component 1303 and a degree prediction function learning component 1326. The dating service provider 1302 is also coupled to an electronic data storage having stored therein data representing a plurality of dating service members (1306-1308) each having a respective set of properties/features or profile information (1310-1312) and a respective set of positive connection information (1320-1322). The dating service provider 1302 receives the profile information (1310-1312) and positive connection information (1320-1322) from one or more respective users (1306-1308). The positive connection information (1320-1322) can include communications initiated by a user with another user, or any other positive interaction between users such as dates, communications, or the like. In response to the received profile information (1310-1312) and positive connection information (1320-1322), the dating service provider 1302 performs SPML/DDML link prediction using the SPML/DDML link prediction system 1324 including the structure preserving distance-metric learning component 1303 and the degree prediction function learning component 1326 to predict new connections for users based on their profile information. The SPML/DDML link prediction system 1324 can, for example, learn a structure preserving distance metric using the structure preserving distance-metric learning component 1303 and learn a degree prediction function using the degree prediction function learning component 1326, as described in FIGS. 4-7 where the profile information (1310-1312) is treated as node properties and the positive connection information (1320-1322) as node links. For example, a new user 1328 can register to join the dating service and the dating service provider 1302 receives the new user 1328 and the new user's profile information 1330. In response to the new user 1328, the dating service provider 1302 can predict new connections to the new user 1328 based on the new user's profile information 1330 using the SPML/DDML link prediction system 1324. The predicted new connections can be communicated to the user for their review (e.g., each user may receive an email listing the new predicted connections or be directed to a web page listing the new predicted connections). For example, a results set 1316 (e.g., in an email or displayed on the user's page at the dating service website) can be provided for the new member 1328. Within the results are listed the new links 1334 selected to match the new member 1328.

In this example and in the example provided in FIG. 14, the nodes of the network data include the members of the dating service. The node properties include member profile information and the links include positive connections established between members of the dating service.

It will be appreciated that the dating service provider 1302 can, in addition to providing new user 1328 with the list of predicted new connections 1318, also provide the new user 1328 as a predicted new connection to those existing users in the list of predicted new connections 1318, via, for example, an email message or a portion of the dating service website.

FIG. 14 is a block diagram of a system for predicting connections between members in a dating service using SPML/DDML according to some embodiments of the disclosed subject matter. In particular, the system 1400 includes the dating service provider 1302, SPML/DDML link prediction system 1324, structure preserving distance-metric learning component 1303, degree prediction function learning component 1326, electronic data storage, plurality of dating service members (1306-1308) each having a respective set of profile information (1310-1312) and a respective set of connections (1320-1322), as described above in FIG. 13. System 1400 also includes, for two members of the dating service site, x and y, predicted new connections results (1402-1404) that each include a plurality of predicted new connections (1406-1408). Various events can trigger the dating service provider 1302 to predict new connections, such as when the dating service provider 1302 receives updates from members to modify their profile information and/or when members update their connection information, and/or other dating service events. In response to these changes, the dating service provider 1302 can perform SPML/DDML link prediction using the SPML/DDML link prediction system 1324 to predict new connections for members based on, for example, their modified profile information or connections. For example, the dating service provider 1302 can create predicted new connections 1406 and 1408 when the profile information of members x and y is modified, and/or when connections for the members are added/removed, and provide the results 1402 and 1404 to users x and y (e.g., in an email or displayed on the user's page at the dating service website).

FIG. 15 is a block diagram of a system for recommending products to new users of a shopping service using SPML/DDML according to some embodiments of the disclosed subject matter. In particular, the system 1500 includes a shopping service provider 1502 that is coupled to an SPML/DDML link prediction system 1524 that can include a structure preserving distance-metric learning component 1503 and a degree prediction function learning component 1526. The shopping service provider 1502 is also coupled to an electronic data storage having stored therein data representing a plurality of shopping service users (1506-1508) each having a respective set of properties/features or user profile and activity/browsing history (1510-1512) and a respective set of links to products based on past purchase events (1520-1522). The data storage also has stored therein data representing a plurality of shopping service products (1532-1534) each having a respective set of product features (1536-1538) and a respective set of links to users based on past purchase events (1540-1542). The shopping service provider 1502 receives the user profile and activity/browsing history (1510-1512) and links to products based on past purchase events (1520-1522) from one or more respective users (1506-1508). In response to the user profile and activity/browsing history (1510-1512) and links to products based on past purchase events (1520-1522), the shopping service provider 1502 performs SPML/DDML link prediction using the SPML/DDML link prediction system 1524 including the structure preserving distance-metric learning component 1503 and the degree prediction function learning component 1526 to recommend products for users based on their user profile and/or activity/browsing history. The SPML/DDML link prediction system 1524 can, for example, learn a structure preserving distance metric using the structure preserving distance-metric learning component 1503 and learn a degree prediction function using the degree prediction function learning component 1526, as described in FIGS. 4-7 where the user profile and activity/browsing history (1510-1512) and product features (1536-1538) are treated as node properties and the links to products based on past purchase events (1520-1522) and links to users based on past purchase events (1540-1542) are treated as node links. For example, a new user 1528 can register to join the shopping network and the shopping service provider 1502 receives the new user 1528 and the new user's profile information 1530. In response to the new user 1528, the shopping service provider 1502 can recommend products to the new user 1528 based on the new user's profile and/or activity/browsing history 1530 using the SPML/DDML link prediction system 1524. The recommended products can be communicated to the user for possible purchase (e.g., each user may receive an email listing the recommended products or be directed to a web page listing the recommended products). For example, a results set 1516 (e.g., in an email or displayed on the user's page at the shopping service website) can be provided for the new user 1528. Within the results are listed the recommended products 1518 selected to match the new user 1528.

In this example and in the example provided in FIG. 16, the nodes of the network data include the users and products of the shopping network service. The node properties include user profile (e.g., gender, age, address, etc.) and activity/browsing history and product features. The node links are between users and products and are determined by purchase events.

FIG. 16 is a block diagram of a system for recommending products to users of a shopping service using SPML/DDML according to some embodiments of the disclosed subject matter. In particular, the system 1600 includes the shopping service provider 1502, SPML/DDML link prediction system 1524, structure preserving distance-metric learning component 1503, degree prediction function learning component 1526, electronic data storage, plurality of shopping service members (1506-1508) each having a respective set of profile and activity/browsing history (1510-1512) and a respective set of links to products based on past purchases (1520-1522), and a plurality of shopping service products (1532-1534) each having a respective set of product features (1536-1538) and a respective set of links to users based on past purchases (1540-1542), as described above in FIG. 15. System 1600 also includes, for two users of the shopping service site, x and y, recommended new product results (1602-1604) that each include a plurality of recommended products (1606-1608). Various events can trigger the shopping service provider 1502 to predict new links between users and products, such as when a user's activity/browser history has changed, and/or when purchases are made, and/or other shopping service events. In response to these changes, the shopping service provider 1502 can perform SPML/DDML link prediction using the SPML/DDML link prediction system 1524 to predict new product recommendations for members based on, for example, their modified profile information or purchases. For example, the shopping service provider 1502 can recommend new products 1606 and 1608 when the profile information of members x and y is modified, when their user activity/browser history is modified, and/or when purchases are made, and provide the results 1602 and 1604 to users x and y (e.g., in an email or displayed on the user's page at the shopping service website).

It will be appreciated that each of the social network, dating service, information network, and shopping service discussed above can be Internet based and provide a website for interaction between the service and its users/members. Users can connect to the servers over any type of network device including but not limited to a desktop computer, a laptop computer, a tablet, a web enabled cell phone, etc.

FIG. 17 is a block diagram of an exemplary embodiment of a structure preserving distance-metric learning link prediction system 1700 according to some embodiments of the disclosed subject matter. System 1700 can include a computer 1702 that can include a processor 1704 and a memory 1706. The computer 1702 can transmit data to and receive data from a data store 1708. The computer 1702 can transmit data to and receive data from a link prediction requestor 1710.

In operation, the processor 1704 will execute instructions stored on the memory 1706 that cause the computer 1702 to access network data from the data store 1708 to perform SPML/DDML link prediction in response to receiving a link prediction request from the link prediction requestor 1710 according to the processes shown in FIGS. 1B and 2-7.

Note that network data may include points that are inevitably disconnected from other points. For example, network data may be available representing friend networks for different schools. In such data, the absence of links between points in different schools carries no information for training the distance metric. However, both sets may be used to train a single metric. Thus, it will be apparent how the above algorithms may be modified to account for this disconnectedness in the training data. Further, networks may contain inherent resistances or amplifiers that affect the likelihood of a link being realized. In addition, some links may indicate a stronger affinity than others. For example, links formed across inconvenient geographic distances or which endure for longer periods of time may be weighted more strongly in the optimization of the distance metric.

In any of the above-described or below-claimed embodiments, in addition to generating recommended or proposed links (relationships, connections, friendships, or transactions, depending on the type of network), the method or system may also store the proposed link and use that new link in further processing for new nodes or proposed nodes. For example, when a social network system recommends a friendship and a transaction confirming the relationship is detected, such as an email exchange, the method or system may incorporate the new link into the network and do additional processing based on the presence of the link. The incorporation of the link in the network may include the storage of new profile data if the link is associated with a new node.
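The bookkeeping described above can be sketched as follows. This is a minimal illustration only: the `Network` class, its field names, and the `propose`/`confirm` methods are hypothetical and are not taken from the disclosed embodiments, and an in-memory set stands in for the data store.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Hypothetical in-memory stand-in for the stored network data."""
    profiles: dict = field(default_factory=dict)  # node id -> profile data
    links: set = field(default_factory=set)       # confirmed links
    proposed: set = field(default_factory=set)    # recommended, not yet confirmed

    def propose(self, a, b):
        # Store a recommended link so later processing can refer to it.
        self.proposed.add(frozenset((a, b)))

    def confirm(self, a, b, new_profiles=None):
        # Called when a confirming transaction (e.g. an email exchange) is detected.
        pair = frozenset((a, b))
        if pair not in self.proposed:
            return False
        # Store profile data for any node that is new to the network.
        for node, profile in (new_profiles or {}).items():
            self.profiles.setdefault(node, profile)
        self.proposed.discard(pair)
        self.links.add(pair)
        return True


net = Network(profiles={"alice": {"school": "A"}})
net.propose("alice", "bob")
confirmed = net.confirm("alice", "bob", new_profiles={"bob": {"school": "A"}})
```

Once a link is confirmed it sits in `links` alongside the observed links, so a later training or prediction pass would see it as part of the network.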

It will be appreciated that the data store 1708 may be attached to the system using any network connection type, or alternatively the data store 1708 can be directly attached to the system.

In any of the disclosed embodiments, including the claims, where a single computer or processor is recited, in alternative embodiments more than one computer or processor may be used, for example to process data in parallel. In the foregoing embodiments and in the claims, the term learning identifies a training process, for example, one involving optimization of a distance metric based on link data. In any of the embodiments, terms such as link, relationship, and transaction are used to identify connections between objects, persons, entities, or other things, which may be represented as a network in a computer data store.

It will be appreciated that according to the above-described or below-claimed embodiments, a trained (or learned) metric allows for the generation of a ranked list of predicted connections between one or more new or target nodes and other nodes, the ranking being by distance as measured by the learned metric. In some embodiments where the degree preference function is not provided, a predetermined value may be used to determine the number of predicted connections to provide from the ranked list. Alternatively, in some other embodiments where the degree preference function is not provided, the number of predicted connections provided may be specified by the user (e.g., the user can specify how many predicted connections to provide) or determined according to a rule responsive to the new or target node properties (e.g., profile data) or inferred from other data indicating user activity on other networks (e.g., when a user joins one social network such as Facebook, the number of links to be predicted could be determined based on the user's properties and/or links existing in another social network such as Google+, which the social network being joined could access using public data without needing the new user's authorization, or, using the authorization of the user, the social network being joined could access the user's private profile and/or link data in the other social network).
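The ranking described above can be illustrated with a short sketch. It assumes the learned metric takes the form of a positive semidefinite matrix M applied as a squared Mahalanobis distance; the function names, the toy feature vectors, and the value k=2 (standing in for the predetermined number of connections) are hypothetical.

```python
import numpy as np


def mahalanobis_sq(M, x, y):
    """Squared distance (x - y)^T M (x - y) under the metric matrix M."""
    d = x - y
    return float(d @ M @ d)


def rank_candidates(M, target, candidates, k=None):
    """Rank candidate nodes by increasing distance to the target node.

    If k is given (e.g. a predetermined or user-specified count), only the
    k closest candidates are returned.
    """
    dists = np.array([mahalanobis_sq(M, target, c) for c in candidates])
    order = np.argsort(dists, kind="stable")
    if k is not None:
        order = order[:k]
    return [(int(i), float(dists[i])) for i in order]


# Toy data (assumed): M weights the second feature four times as heavily.
M = np.diag([1.0, 4.0])
existing = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
new_node = np.array([0.0, 0.0])
top2 = rank_candidates(M, new_node, existing, k=2)
```

When a degree preference function is available, its output for the target node could replace the fixed k; that substitution is not shown here.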

It will also be appreciated that the modules, processes, components, systems, and sections described above can be implemented in hardware, hardware programmed by software, software instructions stored on a non-transitory computer readable medium, or a combination of the above. For example, a method for predicting links in a network can be implemented using a processor configured to execute a sequence of programmed instructions stored on a non-transitory computer readable medium. For example, the processor can include, but not be limited to, a personal computer or workstation or other such computing system that includes a processor, microprocessor, microcontroller device, or is comprised of control logic including integrated circuits such as, for example, an Application Specific Integrated Circuit (ASIC). The instructions can be compiled from source code instructions provided in accordance with a programming language such as Java, C++, C#.net or the like. The instructions can also comprise code and data objects provided in accordance with, for example, the Visual Basic™ language, LabVIEW, or another structured or object-oriented programming language. The sequence of programmed instructions and data associated therewith can be stored in a non-transitory computer-readable medium such as a computer memory or storage device which may be any suitable memory apparatus, such as, but not limited to, read-only memory (ROM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), flash memory, disk drive and the like.

Furthermore, the modules, processes, systems, and sections can be implemented as a single processor or as a distributed processor. Further, it should be appreciated that the steps mentioned above may be performed on a single or distributed processor (single and/or multi-core). Also, the processes, modules, and sub-modules described in the various figures of and for embodiments above may be distributed across multiple computers or systems or may be co-located in a single processor or system. Exemplary structural embodiment alternatives suitable for implementing the modules, sections, systems, means, or processes described herein are provided below.

The modules, processors or systems described above can be implemented as a programmed general purpose computer, an electronic device programmed with microcode, a hard-wired analog logic circuit, software stored on a computer-readable medium or signal, an optical computing device, a networked system of electronic and/or optical devices, a special purpose computing device, an integrated circuit device, a semiconductor chip, and a software module or object stored on a computer-readable medium or signal, for example.

Embodiments of the method and system (or their sub-components or modules) may be implemented on a general-purpose computer, a special-purpose computer, a programmed microprocessor or microcontroller and peripheral integrated circuit element, an ASIC or other integrated circuit, a digital signal processor, a hardwired electronic or logic circuit such as a discrete element circuit, a programmed logic circuit such as a programmable logic device (PLD), programmable logic array (PLA), field-programmable gate array (FPGA), programmable array logic (PAL) device, or the like. In general, any process capable of implementing the functions or steps described herein can be used to implement embodiments of the method, system, or a computer program product (software program stored on a non-transitory computer readable medium).

Furthermore, embodiments of the disclosed method, system, and computer program product may be readily implemented, fully or partially, in software using, for example, object or object-oriented software development environments that provide portable source code that can be used on a variety of computer platforms. Alternatively, embodiments of the disclosed method, system, and computer program product can be implemented partially or fully in hardware using, for example, standard logic circuits or a very-large-scale integration (VLSI) design. Other hardware or software can be used to implement embodiments depending on the speed and/or efficiency requirements of the systems, the particular function, and/or particular software or hardware system, microprocessor, or microcomputer being utilized. Embodiments of the method, system, and computer program product can be implemented in hardware and/or software using any known or later developed systems or structures, devices and/or software by those of ordinary skill in the applicable art from the function description provided herein and with a general basic knowledge of network link prediction and/or computer programming arts.

Moreover, embodiments of the disclosed method, system, and computer program product can be implemented in software executed on a programmed general purpose computer, a special purpose computer, a microprocessor, or the like.

It is, thus, apparent that there is provided, in accordance with the present disclosure, systems, methods, and devices for enhancing the value of network based systems. Many alternatives, modifications, and variations are enabled by the present disclosure. Features of the disclosed embodiments can be combined, rearranged, omitted, etc., within the scope of the invention to produce additional embodiments. Furthermore, certain features may sometimes be used to advantage without a corresponding use of other features. Accordingly, Applicants intend to embrace all such alternatives, modifications, equivalents, and variations that are within the spirit and scope of the present invention.

Embodiments of the disclosed subject matter can include a method for generating proposed recommendations (or predictions) for new relationships (or links) in a social network and directing an output from at least one computer network server to a terminal connected thereto by a computer network. Node properties (or profiles) and links can be stored on a data store accessible by the at least one computer network server. Each profile can be a data set containing characteristics of a respective one of a plurality of persons and each link can be a data set that corresponds to a relationship of a predefined type between one of the plurality of persons and a linked one of the plurality of persons such that some of the plurality of persons are linked to first persons and unlinked to second persons, whereby each link corresponds to a linked pair of persons. The totality of links can define a network. The method can include, using at least one computer network server, programmatically training a classifier based on distance metrics, each distance metric characterizing a respective one of the linked pairs. The distance metric can be responsive to outside links, which are links other than the respective one of the linked pairs, such that the totality of links can be derived from the classifier based on the profiles without the links. Data corresponding to a new person not linked to any other persons in the network can be received and a new profile representing the new person can be generated. This data can be received when a new user registers to join the social network, and the social network can recommend/predict to the new user connections to existing users. The method can include, using the classifier, generating predicted links responsively to the new profile and outputting data responsive to the predicted links.

In some such embodiments the method can also include receiving relationship data from the plurality of persons and generating a new link responsive thereto, wherein the relationship data include data indicating at least one communication event between persons joined by the new link. For example, a new link can be generated when users of a dating service network communicate with each other.

In some such embodiments the method can also include receiving relationship data from the plurality of persons and generating a new link responsive thereto, wherein the relationship data include data indicating a command received from a respective one of the plurality of persons to be connected to another of the plurality of persons. For example, a new link can be generated when users of a social network “friend” each other to form a connection or link.

In some such embodiments the method can also include receiving relationship data from the plurality of persons and generating a new link responsive thereto, wherein the relationship data include data indicating a common class to which persons joined by the new link belong. The common class can include any or all of a family, a school class, membership in a club, a common employer, a common vocation or hobby, or a geographic distance between residences of the persons joined by the new link. The common class can be responsive to transaction data received by the at least one computer network server, and the transactions can represent transactions between persons joined by the new link. The transactions can include communication transactions and commercial transactions between persons joined by the new link.

Embodiments of the disclosed subject matter can include computer readable media each containing program instructions for causing the at least one computer network server and/or a processor to implement one or more of any of the various methods described herein.

Embodiments of the disclosed subject matter can include a method for recommending a new relationship for network members. The method can include storing profile data characterizing each of the network members according to predefined features of each of the members. The method can also include storing relationship data that defines the presence of predefined relationships among the network members based on data indicating transactions between the network members and/or data provided a priori to indicate the existence of a relationship, the relationship thereby defining links between the network members. A request can be received, at a network server, from a client of the network server, for a prediction for a target member of a new relationship that is not present in the relationship data. The method can include, at the network server, predicting, for the target member, the new relationship, responsively to profile data characterizing the target member and responsively to relationship (or link) data defining relationships (or links) among network members.

Embodiments of the disclosed subject matter can include a method for generating product recommendations. The method can include receiving, at a computer network server, profile data and transaction data indicating transactions of shoppers using a shopping web site. The profile data can characterize features of the shoppers (such as but not limited to age, gender, address, etc.). The profile data can also include features of products offered by the shopping web site. The method can include storing link data representing links, each link defining an association between a respective one of the shoppers and a product with respect to which the shopper performed a transaction (such as a purchase and/or adding the product to the user's shopping cart or a wish list indicating an interest in the product). A classifier can be trained (or learned) based on the link data, and new product recommendation data can be generated for current shoppers using the shopping site based on the classifier and profile data characterizing the features of the current shoppers.

Embodiments of the disclosed subject matter can include a method for generating proposed link recommendations for output to requesting processes running on one or more processor devices connected to at least one computer network server through a connecting computer network. The method can include storing, on a data store that is accessible by the at least one computer network server, profiles and links, each profile of the profiles being a data set containing characteristics of a respective one of a plurality of entities, each link of the links being a data set that corresponds to a relationship of a predefined type between one of the plurality of entities and a linked one of the plurality of entities such that some of the plurality of entities are linked to respective first entities and not linked to second entities, whereby each link corresponds to a linked pair of entities, the totality of links defining a relationship network. A classifier can be programmatically trained (or learned) based on distance metrics, each distance metric characterizing a respective one of the linked pairs, wherein the distance metric is responsive to links other than ones corresponding to the linked pair; the classifier being such that at least a substantial extent of a totality of the links can be derived from the classifier responsively to the profiles without the information content of the links, whereby the trained (or learned) classifier contains all the structural information of the extent of the relationship network. The method can also include receiving a profile corresponding to a new entity and generating at least one link representing the new entity.

In some such embodiments the generating can include using the classifier to estimate a structure of a new network that includes the new entity, including predicting a number of the at least one link. For example, SPML or DDML can be used to learn a structure preserving classifier and, optionally, a degree preference function.

Embodiments of the disclosed subject matter can include a computerized method for predicting links between nodes in a network using a computing device. The method can include storing data representing node properties in a data storage device accessible by a processor. Links between the nodes can be stored in the data storage device. Each node property can represent a characteristic of a person, a document, an event, web site, or other thing. Each link can represent a relationship between nodes, whereby the links define a relationship network. A classifier can be generated (or learned) from the relationship links and the node properties using a structure preserving method adapted to, when so-generated (or learned), reproduce substantially all of the links from the node properties, whereby the classifier substantially preserves a structure defined by the links. A link prediction request can be received from a prediction requestor, the link prediction request specifying an input node having input node properties. A plurality of new links can be predicted for the input node responsively to the input node properties and the learned distance metric. The method can also include transmitting the predicted plurality of new links to the prediction requestor.

Embodiments of the disclosed subject matter can include a computerized method for predicting the degree of a node in a network using a computing device. The method can include storing network data representing node properties and links between the nodes in a data storage device accessible by a processor, each node property representing a characteristic of a person, a document, an event, web site, or other thing, and each link representing a relationship between nodes, the aggregate properties and links defining a network. A degree prediction function can be generated (or learned) from the network data including the node properties and the links between the nodes using a structure preserving process. The degree prediction function can be substantially structure preserving, and the degree prediction function can substantially predict the degrees of the nodes based on the node properties. A degree prediction request can be received from a prediction requestor, the degree prediction request specifying an input node having input node properties. A degree prediction can be generated for the input node responsively to the input node properties and the degree prediction function. The method can include transmitting the degree prediction to the prediction requestor.

Embodiments of the disclosed subject matter can include a computerized method for learning a structure preserving distance metric for an existing network to predict connectivity of a new network using a computing device. The method can include providing existing network data accessible by a processor, the existing network data representing node properties and links between the nodes. Each node property can represent a characteristic of a person, a document, an event, web site, or other thing, and each link can represent a relationship between the thing represented by the node, the aggregate properties and links defining an existing network. A learned distance metric can be generated (or learned) from the existing network data including the node properties and the links between the nodes using a structure preserving process. The learned distance metric can be substantially structure preserving; the learned distance metric can substantially recreate the links between the nodes when used by a connectivity algorithm to recreate links in the existing network based on the node properties. A network prediction request can be received from a prediction requestor, the network prediction request specifying a set of input nodes, each having input node properties. A plurality of new links can be predicted between the set of input nodes responsively to the input node properties and the learned distance metric. The method can include transmitting the predicted plurality of new links to the prediction requestor.

Embodiments of the disclosed subject matter can include a computerized method for predicting links between users in an online social network using a computing device. The method can include storing network data representing user properties and links between the users in a data storage device accessible by a processor. Each user property can represent a characteristic of the user, and each link can represent a relationship between users, the aggregate properties and links defining a network. A learned distance metric and a degree predicting function can be generated (or learned) from the network data including the user properties and the links between the users using a structure preserving process. The learned distance metric can be substantially structure preserving; the learned distance metric and degree predicting function can substantially recreate the links between the users when used by a connectivity algorithm to recreate links in the network based on the user properties. A link prediction request can be received from a prediction requestor, the link prediction request specifying an input user having input user properties. A plurality of new links can be predicted for the input user responsively to the input user properties and the learned distance metric. The method can include transmitting the predicted plurality of new links to the prediction requestor.

Embodiments of the disclosed subject matter can include a computerized method for learning a structure preserving distance metric and a degree predicting function from a network. The method can include providing network data accessible by a processor, the network data representing node properties and observed links between the nodes. Each node property can represent a characteristic of a person, a document, an event, web site, or other thing, and each observed link can represent a relationship between the thing represented by the node, the aggregate properties and observed links defining a network. The method can include generating (or learning) a learned distance metric and degree predicting function from the network data including the node properties and the observed links using a structure preserving process. The learned distance metric can be substantially structure preserving; the learned distance metric can substantially recreate the observed links when used by a connectivity algorithm with the degree predicting function to predict links in the network based on the node properties.

Embodiments of the disclosed subject matter can include a computerized method for predicting links between nodes in a network using a computing device. The method can include storing network data representing node properties and links between the nodes in a data storage device accessible by a processor, each node property representing a characteristic of a person, a document, an event, web site, or other thing, and each link representing a relationship between the thing represented by the node, the aggregate properties and links defining a network. A learned distance metric can be generated (or learned) from the network data including the node properties and the links between the nodes using a structure preserving process. The learned distance metric can be substantially structure preserving; the learned distance metric can substantially recreate the links between the nodes when used with a connectivity algorithm to recreate links in the network based on the node properties. The method can include receiving a link prediction request from a prediction requestor, the link prediction request specifying an input node having input node properties and a plurality of input node links. A plurality of new links can be predicted for the input node responsively to the node, the learned distance metric, and the learned degree preference function. The method can include transmitting the predicted plurality of new links to the prediction requestor.

Embodiments of the disclosed subject matter can include a computerized method for predicting links between nodes in a network using a computing device. The method can include storing network data representing node properties and links between the nodes in a data storage device accessible by a processor, each node property representing a characteristic of a person, a document, an event, web site, or other thing, and each link representing a relationship between the thing represented by the node, the aggregate properties and links defining a network. The method can also include providing a distance metric learned from the network data including the node properties and the links between the nodes using a structure preserving process. The learned distance metric can be substantially structure preserving; the learned distance metric can substantially recreate the links between the nodes when used with a connectivity algorithm to recreate links in the network based on the node properties. A link prediction request can be received from a prediction requestor, the link prediction request specifying an input node having input node properties and a plurality of input node links. The method can include predicting a plurality of new links for the input node responsively to the node, the learned distance metric, and the learned degree preference function. The predicted plurality of new links can be transmitted to the prediction requestor.

Embodiments of the disclosed subject matter can include a computerized method for valuing relationships between entities according to their respective descriptions using a computing device. The method can include storing a list of links and feature vectors in a digital data store accessible to a processor. A predictor can be trained (or learned), using the processor, from the list of links and feature vectors, each feature vector characterizing a node linked by the links, the predictor being a trainable nonlinear classifier. The predictor can be effective for generating a distance estimate from the feature vectors of a pair of nodes. The training can tune a metric so that, based on the respective feature vectors, it estimates a shorter distance for linked ones of at least three feature vectors and a greater distance for unlinked ones of the at least three feature vectors, for all the feature vectors, to produce a trained predictor. The method can include, using the trained predictor, estimating distances between pairs of nodes at least one of whose nodes was not used to train the predictor. The method can also include outputting selected ones of the estimated distances from the estimating.
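The training objective described above, a shorter distance for linked pairs than for unlinked pairs, can be illustrated with a hinge penalty over triplets. This is a sketch under stated assumptions, not the disclosed training procedure: it uses a linear Mahalanobis-style metric M rather than a nonlinear predictor, and the function name, the unit margin, and the toy data are hypothetical.

```python
import numpy as np


def triplet_hinge_loss(M, X, triplets, margin=1.0):
    """Average hinge penalty over triplets (i, j, k).

    Each triplet says node i is linked to node j and not linked to node k.
    The penalty is zero when the unlinked pair is farther than the linked
    pair by at least `margin` under the metric M.
    """
    total = 0.0
    for i, j, k in triplets:
        d_link = (X[i] - X[j]) @ M @ (X[i] - X[j])
        d_nolink = (X[i] - X[k]) @ M @ (X[i] - X[k])
        total += max(0.0, float(d_link - d_nolink) + margin)
    return total / len(triplets)


# Toy feature vectors (assumed): node 0 is linked to node 1, not to node 2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
triplets = [(0, 1, 2)]

loss_good = triplet_hinge_loss(np.eye(2), X, triplets)            # constraint met
loss_bad = triplet_hinge_loss(np.diag([1.0, 0.1]), X, triplets)   # constraint violated
```

A training loop would adjust M to drive this penalty toward zero over all triplets; the optimizer itself is omitted here.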

Embodiments of the disclosed subject matter can include a computerized method for predicting new links in a network. The network can be, for example, a social network, a dating service network, a shopping network, or any other type of network. The method can include accessing network data from a data store. The network data can include nodes and links, the nodes each having properties characterizing each node and the links each representing a connection between two of the nodes, the nodes and links comprising a network. For example, the nodes can be users of a social network each having profile information as node properties and each user establishing friendships or connections with other users of the social network which can be represented by the links. The method can include learning a classifier for predicting new links in the network, which includes learning a Mahalanobis distance metric M for the network and applying one or more linear constraints on M. The linear constraints applied on M can be configured to enforce the structure of the network to be preserved in M. A link prediction request can be received from a prediction requestor, the request indicating a target node having target properties. For example, the link prediction requestor can be a user registering for a social network for the first time and requesting that the social network provide predicted or recommended links to the user to establish friendships or connections with other users of the social network. In another example, the prediction requestor can be a component of the network (e.g., social network) configured to provide predicted links to its users at periodic intervals or in response to certain user actions (such as registering to join the social network, changing their user profile, etc.).
The method can include predicting one or morenew links for the target node responsive to the target node propertiesby applying a connectivity algorithm to the target node and the networknodes using the learned classifier including the learned distance metricM. The method can also include transmitting the one or more predictednew links to the prediction requestor.
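
As a non-limiting sketch of the prediction step, the following Python applies a given matrix M to a target node. The squared Mahalanobis distance (x - y)^T M (x - y) is linear in the entries of M, so a requirement of the form d_M(i, k) - d_M(i, j) >= margin, for a linked pair (i, j) and an unlinked pair (i, k), is a linear constraint on M. The particular constraint form, the choice of k-nearest neighbors as the connectivity algorithm, the example matrix, the toy data, and the function names are assumptions made for illustration and are not taken from the disclosure.

```python
def mahalanobis_sq(M, x, y):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a square matrix M."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[r] * M[r][c] * d[c] for r in range(n) for c in range(n))

def violated_constraints(M, features, links, margin=1.0):
    """Triplets (i, j, k), i-j linked and i-k unlinked, where
    d_M(i, k) - d_M(i, j) >= margin fails (assumed constraint form)."""
    linked = {frozenset(pair) for pair in links}
    bad = []
    for a, b in links:
        for i, j in ((a, b), (b, a)):
            for k in range(len(features)):
                if k in (i, j) or frozenset((i, k)) in linked:
                    continue
                gap = (mahalanobis_sq(M, features[i], features[k])
                       - mahalanobis_sq(M, features[i], features[j]))
                if gap < margin:
                    bad.append((i, j, k))
    return bad

def predict_links(M, target, features, k=2):
    """k-nearest-neighbor connectivity: indices of the k nodes closest to target."""
    order = sorted(range(len(features)),
                   key=lambda i: (mahalanobis_sq(M, target, features[i]), i))
    return order[:k]

# Toy network (assumed): linked nodes differ along the direction (1, 1).
features = [[0.0, 0.0], [2.0, 2.0], [3.0, 0.0], [5.0, 2.0]]
links = [(0, 1), (2, 3)]

identity = [[1.0, 0.0], [0.0, 1.0]]   # plain squared Euclidean distance
M = [[1.0, -1.0], [-1.0, 1.0]]        # positive semidefinite; ignores direction (1, 1)

print(violated_constraints(identity, features, links))  # some constraints fail
print(violated_constraints(M, features, links))         # none fail
print(predict_links(M, [1.0, 1.0], features))
```

Under the plain Euclidean distance some of the constraints fail on this toy network, whereas the example M satisfies all of them, and the target node is then connected to the nodes closest to it under M.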

Embodiments of the disclosed subject matter can include a computerized method for making recommendations to users. The method can include receiving at a receiving computer, from a requesting computer, a request indicative of a proposal for a joining entity to join a network, the network representing relationships between networked entities and defined by network data stored in a computer accessible data store. The network data can include feature data characterizing networked entities and link data indicating relationships between respective pairs of the networked entities. The relationships can include transactions, affinities, friendships, common classes to which the entities belong, the entities including businesses or other organizations, people, countries, types, animals or other living things, or anything else that may be characterizable by a network. A user can submit the request through a website and the request can be in the form of an HTTP request. The method can include accessing the network data at the receiving computer or one or more processing computers in communication with the receiving computer and generating a message responsive to a ranking of possible relationships between the joining entity and the networked entities. The ranking can be responsive to feature data characterizing the joining entity. The generating can be by a computational process such that, if the joining entity feature data were identical to the feature data of one of the networked entities, the relationships of the one of the networked entities stored in the network data would be of identical ranking. The responsive message can include or be included within an HTTP response provided to the user in response to the user's HTTP request.
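
One possible reading of the ranking property above can be illustrated with a short Python sketch. It assumes, for illustration only, that possible relationships are ranked by a distance between feature data, that squared Euclidean distance stands in for whatever learned distance an embodiment uses, and that the response body is JSON; the entity names, feature values, and function names are hypothetical.

```python
import json

def squared_euclidean(x, y):
    """Stand-in distance; an embodiment could substitute a learned metric."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def rank_candidates(joining_features, entities, distance):
    """Entity ids ordered from closest to farthest; ties broken by id."""
    return sorted(entities,
                  key=lambda eid: (distance(joining_features, entities[eid]), eid))

# Hypothetical network data: feature data per entity and stored links.
entities = {"ann": [0.0, 1.0], "bob": [0.2, 1.1],
            "cho": [4.0, 0.0], "dee": [4.1, 0.3]}
links = {("ann", "bob"), ("cho", "dee")}

# A joining entity whose feature data is identical to that of "ann".
joining = list(entities["ann"])
ranking = rank_candidates(joining, entities, squared_euclidean)

# Skip entities with identical feature data, then keep the top suggestion.
recommended = [eid for eid in ranking if entities[eid] != joining][:1]
message = json.dumps({"recommended": recommended})
print(ranking, message)
```

Because the ranking depends only on feature data, a joining entity with the same features as "ann" receives the same ordering that "ann" would, and the top suggestion coincides with the link stored for "ann".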

1-74. (canceled)
 75. A method for generating proposed link recommendations for output to requesting processes running on one or more processor devices connected to at least one computer network server through a connecting computer network, comprising: storing, on a data store that is accessible by the at least one computer network server, profiles and links, each profile of the profiles being a data set containing characteristics of a respective one of a plurality of entities, each link of the links being a data set that corresponds to a relationship of a predefined type between one of the plurality of entities and a linked one of the plurality of entities such that some of the plurality of entities are linked to respective first entities and not linked to second entities, whereby each link corresponds to a linked pair of entities, the totality of links defining a relationship network; the at least one computer network server programmatically training a classifier based on distance metrics, each distance metric characterizing a respective one of the linked pairs, wherein the distance metric is responsive to links other than ones corresponding to the linked pair, the classifier being such that at least a substantial extent of a totality of the links can be derived from the classifier responsively to the profiles without the information content of the links, whereby the trained classifier contains all the structural information of the extent of the relationship network; and receiving a profile corresponding to a new entity and generating at least one link representing the new entity.
 76. The method of claim 75, wherein the number of the at least one link is read from a data store storing predetermined data.
 77. The method of claim 75, wherein the generating includes using the classifier to estimate a structure of a new network that includes the new entity, including predicting a number of the at least one link.
 78. The method of claim 75, wherein the entities are members of a social network.
 79. The method of claim 78, wherein the social network is generated by a dating web site.
 80. The method of claim 75, wherein the entities are students, the links represent friends, and the new entity is an incoming student for whom there are no links.
 81. A non-transitory computer-readable medium comprising instructions stored thereon that, when executed by a processor, cause the processor to implement a method for generating proposed link recommendations for output to requesting processes running on one or more processor devices connected to at least one computer network server through a connecting computer network, the method comprising: storing, on a data store that is accessible by the at least one computer network server, profiles and links, each profile of the profiles being a data set containing characteristics of a respective one of a plurality of entities, each link of the links being a data set that corresponds to a relationship of a predefined type between one of the plurality of entities and a linked one of the plurality of entities such that some of the plurality of entities are linked to respective first entities and not linked to second entities, whereby each link corresponds to a linked pair of entities, the totality of links defining a relationship network; programmatically training, by the at least one computer network server, a classifier based on distance metrics, each distance metric characterizing a respective one of the linked pairs, wherein the distance metric is responsive to links other than ones corresponding to the linked pair, the classifier being such that at least a substantial extent of a totality of the links can be derived from the classifier responsively to the profiles without the information content of the links, whereby the trained classifier contains all the structural information of the extent of the relationship network; and receiving a profile corresponding to a new entity and generating at least one link representing the new entity.
 82. The non-transitory computer-readable medium of claim 81, wherein the number of the at least one link is read from a data store storing predetermined data.
 83. The non-transitory computer-readable medium of claim 81, wherein the generating includes using the classifier to estimate a structure of a new network that includes the new entity, including predicting a number of the at least one link.
 84. The non-transitory computer-readable medium of claim 81, wherein the entities are members of a social network.
 85. The non-transitory computer-readable medium of claim 84, wherein the social network is generated by a dating web site.
 86. The non-transitory computer-readable medium of claim 81, wherein the entities are students, the links represent friends, and the new entity is an incoming student for whom there are no links.
 87. A computer network server comprising: a processor; and a data store; wherein the processor is configured to: store, on the data store, profiles and links, each profile of the profiles being a data set containing characteristics of a respective one of a plurality of entities, each link of the links being a data set that corresponds to a relationship of a predefined type between one of the plurality of entities and a linked one of the plurality of entities such that some of the plurality of entities are linked to respective first entities and not linked to second entities, whereby each link corresponds to a linked pair of entities, the totality of links defining a relationship network; programmatically train a classifier based on distance metrics, each distance metric characterizing a respective one of the linked pairs, wherein the distance metric is responsive to links other than ones corresponding to the linked pair, the classifier being such that at least a substantial extent of a totality of the links can be derived from the classifier responsively to the profiles without the information content of the links, whereby the trained classifier contains all the structural information of the extent of the relationship network; and receive a profile corresponding to a new entity; and generate at least one link representing the new entity.
 88. The computer network server of claim 87, wherein the number of the at least one link is read from a data store storing predetermined data.
 89. The computer network server of claim 87, wherein the processor is further configured to use the classifier to estimate a structure of a new network that includes the new entity, including predicting a number of the at least one link.
 90. The computer network server of claim 87, wherein the entities are members of a social network.
 91. The computer network server of claim 90, wherein the social network is generated by a dating web site.
 92. The computer network server of claim 87, wherein the entities are students, the links represent friends, and the new entity is an incoming student for whom there are no links.